<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>Kirio</title>
<link href="https://luochenxi.github.io/atom.xml" rel="self"/>
<link href="https://luochenxi.github.io/"/>
<updated>2020-08-24T10:00:00.000Z</updated>
<id>https://luochenxi.github.io/</id>
<author>
<name>luochenxi</name>
</author>
<generator uri="https://hexo.io/">Hexo</generator>
<entry>
<title>mac install neovim</title>
<link href="https://luochenxi.github.io/2020/08/24/yuque/mac%20install%20neovim/"/>
<id>https://luochenxi.github.io/2020/08/24/yuque/mac%20install%20neovim/</id>
<published>2020-08-24T10:00:00.000Z</published>
<updated>2020-08-24T10:00:00.000Z</updated>
<content type="html"><![CDATA[<h4 id="install-neovim"><a href="#install-neovim" class="headerlink" title="install neovim"></a>install neovim</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">brew install neovim</span><br></pre></td></tr></table></figure><h4 id="install-vim-plug"><a href="#install-vim-plug" class="headerlink" title="install vim-plug"></a>install vim-plug</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">curl -fLo ~/.local/share/nvim/site/autoload/plug.vim --create-dirs \</span><br><span class="line"> https://raw.githubusercontent.com/junegunn/vim-plug/master/plug.vim</span><br></pre></td></tr></table></figure>]]></content>
<summary type="html"><h4 id="install-neovim"><a href="#install-neovim" class="headerlink" title="install neovim"></a>install neovim</h4><figure class="highlight </summary>
<category term="neovim" scheme="https://luochenxi.github.io/categories/neovim/"/>
<category term="neovim" scheme="https://luochenxi.github.io/tags/neovim/"/>
</entry>
<entry>
<title>Batch-deleting keys in Redis</title>
<link href="https://luochenxi.github.io/2020/08/17/yuque/redis%E6%89%B9%E9%87%8F%E5%88%A0%E9%99%A4key/"/>
<id>https://luochenxi.github.io/2020/08/17/yuque/redis%E6%89%B9%E9%87%8F%E5%88%A0%E9%99%A4key/</id>
<published>2020-08-17T07:49:00.000Z</published>
<updated>2020-08-17T07:49:00.000Z</updated>
<content type="html"><![CDATA[<ul><li><strong>Background</strong></li></ul><p>When Redis holds too much data, old keys have to be deleted manually. The shell command below does this.</p><ul><li>Shell command</li></ul><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">redis-cli -n 3 -h m8774.test.m.com -p 8774 -a "password" --scan --pattern '*' | xargs redis-cli -n 3 -h m8774.test.m.com -p 8774 -a "password" del</span><br><span class="line"></span><br><span class="line"><span class="meta">#</span><span class="bash"> Parameters</span></span><br><span class="line"><span class="meta">#</span><span class="bash"> -n DB: number of the database to use</span></span><br><span class="line"><span class="meta">#</span><span class="bash"> -h IP or host</span></span><br><span class="line"><span class="meta">#</span><span class="bash"> -a <span class="string">"password"</span></span></span><br></pre></td></tr></table></figure>]]></content>
<summary type="html"><ul>
<li><strong>Background</strong></li>
</ul>
<p>When Redis holds too much data, old keys have to be deleted manually. The shell command below does this.</p>
<ul>
<li>Shell command</li>
</ul>
<figure class="highlight sh</summary>
<category term="redis" scheme="https://luochenxi.github.io/categories/redis/"/>
<category term="redis" scheme="https://luochenxi.github.io/tags/redis/"/>
<category term="scan" scheme="https://luochenxi.github.io/tags/scan/"/>
</entry>
<entry>
<title>Hive data deduplication and row_number()</title>
<link href="https://luochenxi.github.io/2020/07/08/yuque/Hive%E6%95%B0%E6%8D%AE%E5%8E%BB%E9%87%8D%E5%8F%8Arow_number()/"/>
<id>https://luochenxi.github.io/2020/07/08/yuque/Hive%E6%95%B0%E6%8D%AE%E5%8E%BB%E9%87%8D%E5%8F%8Arow_number()/</id>
<published>2020-07-08T07:21:00.000Z</published>
<updated>2020-07-08T07:21:00.000Z</updated>
<content type="html"><![CDATA[<h3 id="业务需求:"><a href="#业务需求:" class="headerlink" title="业务需求:"></a>Business requirement:</h3><p>Deduplicate Hive data, keeping one row from each group of duplicates as the requirement dictates.</p><p>Sample data:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">no. feed_id create_time tag_time</span><br><span class="line">1 6684032 1594137265 1593608403</span><br><span class="line">2 6684032 1594137263 1593608403</span><br><span class="line">3 6684056 1594137276 1594137276</span><br></pre></td></tr></table></figure><p>Only rows 1 and 3 are needed: the tag_time of row 2 duplicates that of row 1, so the duplicate row 2 has to be removed.</p><h3 id="方案:"><a href="#方案:" class="headerlink" title="方案:"></a>Solution:</h3><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span></span><br><span class="line"> *</span><br><span class="line"><span class="keyword">from</span>(</span><br><span class="line"> <span class="keyword">select</span></span><br><span class="line"> *,</span><br><span class="line"> row_number() <span class="keyword">over</span> (</span><br><span class="line"> <span class="keyword">partition</span> <span class="keyword">by</span> feed_id</span><br><span class="line"> <span class="keyword">order</span> <span class="keyword">by</span></span><br><span class="line"> tag_time <span class="keyword">desc</span></span><br><span class="line"> ) <span class="keyword">num</span></span><br><span class="line"> <span class="keyword">from</span></span><br><span class="line"> <span class="keyword">table</span></span><br><span class="line"> ) t <span class="keyword">where</span> t.num = <span class="number">1</span>;</span><br></pre></td></tr></table></figure><p>Analysis:<br>row_number() over (partition by feed_id order by tag_time desc) num, keeping num = 1 <br>This first groups the rows by feed_id and sorts each group by tag_time in descending order. The value computed by row_number() is the row's sequence number within its feed_id group after sorting (consecutive and unique within a group).<br>Taking the first row of each group (num = 1) therefore gives the deduplicated result.</p><p>PS:<br>Basic usage of the ROW_NUMBER() OVER function <br>Syntax: ROW_NUMBER() OVER(PARTITION BY COLUMN ORDER BY COLUMN)<br>Put simply, row_number() starts at 1 and returns a number for each row within a partition. Here ROW_NUMBER() OVER (ORDER BY xyz DESC) first sorts the xyz column in descending order, then returns a sequence number for each xyz row in that order.<br>Example:</p><figure class="highlight css"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="selector-tag">xyz</span> <span class="selector-tag">row_num</span></span><br><span class="line">600 1</span><br><span class="line">400 2</span><br><span class="line">300 3</span><br><span class="line">200 4</span><br></pre></td></tr></table></figure><ul><li>Reference:<ul><li><a href="https://blog.csdn.net/yimingsilence/java/article/details/70140877">Hive data deduplication</a></li></ul></li></ul>]]></content>
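<summary type="html"><h3 id="业务需求:"><a href="#业务需求:" class="headerlink" title="业务需求:"></a>Business requirement:</h3><p>Deduplicate Hive data, keeping one row from each group of duplicates as the requirement dictates.</p>
<p>Sample data:</p>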
<figure class="highl</summary>
<category term="hive" scheme="https://luochenxi.github.io/categories/hive/"/>
<category term="hive" scheme="https://luochenxi.github.io/tags/hive/"/>
<category term="数据去重" scheme="https://luochenxi.github.io/tags/%E6%95%B0%E6%8D%AE%E5%8E%BB%E9%87%8D/"/>
</entry>
<entry>
<title>Previewing images on the command line on Mac&amp;Linux</title>
<link href="https://luochenxi.github.io/2019/11/18/yuque/Mac&Linux%E5%9C%A8%E5%91%BD%E4%BB%A4%E8%A1%8C%E4%B8%AD%E9%A2%84%E8%A7%88%E5%9B%BE%E7%89%87/"/>
<id>https://luochenxi.github.io/2019/11/18/yuque/Mac&Linux%E5%9C%A8%E5%91%BD%E4%BB%A4%E8%A1%8C%E4%B8%AD%E9%A2%84%E8%A7%88%E5%9B%BE%E7%89%87/</id>
<published>2019-11-18T03:06:00.000Z</published>
<updated>2020-08-25T10:08:52.340Z</updated>
<content type="html"><![CDATA[<h3 id="Preface"><a href="#Preface" class="headerlink" title="Preface"></a>Preface</h3><blockquote><p>Preview images directly in the [Mac/Linux] command line</p></blockquote><h3 id="Install"><a href="#Install" class="headerlink" title="Install"></a>Install</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">pip install imgcat</span><br></pre></td></tr></table></figure><h3 id="Usage"><a href="#Usage" class="headerlink" title="Usage"></a>Usage</h3><p><strong>shell</strong></p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">imgcat ZPepx5YtlY3a6nG3waau0Kk4YqWEK.jpg</span><br></pre></td></tr></table></figure><p><strong>code</strong></p><p><img class="lazyload" data-src="https://cdn.nlark.com/yuque/0/2019/png/283236/1574055736799-675cd814-98a6-419e-a4e1-aed94abdd3d6.png#align=left&display=inline&height=1017&margin=%5Bobject%20Object%5D&originHeight=1017&originWidth=948&status=done&style=none&width=948" src="/"></p><h3 id="Show"><a href="#Show" class="headerlink" title="Show"></a>Show</h3><p><img class="lazyload" data-src="https://cdn.nlark.com/yuque/0/2019/png/283236/1574048591610-54099f78-e25a-4974-a290-6443e893eaa8.png#align=left&display=inline&height=1736&margin=%5Bobject%20Object%5D&originHeight=1736&originWidth=2342&status=done&style=none&width=2342" src="/"></p>]]></content>
<summary type="html"><h3 id="Preface"><a href="#Preface" class="headerlink" title="Preface"></a>Preface</h3><blockquote>
<p>Preview images directly in the [Mac/Linux] command line</p>
</blockqu</summary>
<category term="mac" scheme="https://luochenxi.github.io/categories/mac/"/>
<category term="mac" scheme="https://luochenxi.github.io/tags/mac/"/>
<category term="imgcat" scheme="https://luochenxi.github.io/tags/imgcat/"/>
</entry>
<entry>
<title>Read/write splitting and failover for MongoDB replica set clients</title>
<link href="https://luochenxi.github.io/2019/10/12/yuque/Mongo%E5%89%AF%E6%9C%AC%E9%9B%86client%E8%AF%BB%E5%86%99%E5%88%86%E7%A6%BB%E5%92%8C%E6%95%85%E9%9A%9C%E8%BF%81%E7%A7%BB/"/>
<id>https://luochenxi.github.io/2019/10/12/yuque/Mongo%E5%89%AF%E6%9C%AC%E9%9B%86client%E8%AF%BB%E5%86%99%E5%88%86%E7%A6%BB%E5%92%8C%E6%95%85%E9%9A%9C%E8%BF%81%E7%A7%BB/</id>
<published>2019-10-12T07:13:00.000Z</published>
<updated>2020-08-25T10:08:52.364Z</updated>
<content type="html"><![CDATA[<h2 id="前言:"><a href="#前言:" class="headerlink" title="前言:"></a>Preface:</h2><p>Connect to a MongoDB replica set with pymongo to get read/write splitting and automatic primary/secondary switchover, so that a failure of the primary node does not take the service down.</p><h2 id="副本集实例-Connection-String-URI-连接示例"><a href="#副本集实例-Connection-String-URI-连接示例" class="headerlink" title="副本集实例 Connection String URI 连接示例"></a>Replica set instance: Connection String URI example</h2><ol><li>Obtain the Connection String URI for the replica set instance. For details, see [<a href="http://docs.mongodb.org/manual/reference/connection-string">Connection String URI</a>]</li></ol><p>Main parts of the connection string:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">mongodb://username:password@mongodb0.example.com:27017,mongodb1.example.com:27017,mongodb2.example.com:27017/admin?replicaSet=myRepl&authSource=admin</span><br></pre></td></tr></table></figure><ul><li>replicaSet: specifies the name of the <a href="https://docs.mongodb.com/manual/reference/glossary/#term-replica-set">replica set</a></li><li>authSource: specifies the name of the database associated with the user's credentials. <a href="https://docs.mongodb.com/manual/reference/connection-string/#urioption.authSource"><code>authSource</code></a> defaults to the database specified in the connection string.</li></ul><ol start="2"><li>Read/write splitting:</li></ol><p>To enable read/write splitting, add <code>readPreference=secondaryPreferred</code> to the <strong>options</strong> of the Connection String URI, which makes read requests prefer Secondary nodes.<br>For more read options, see <a href="https://docs.mongodb.com/manual/core/read-preference/">Read preferences</a>.</p><a id="more"></a><p>Example</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">mongodb://username:password@dds-xxxxxxxxxxxx:3007,xxxxxxxxxxxx:3007/admin?replicaSet=mgset-xxxxxx&readPreference=secondaryPreferred</span><br></pre></td></tr></table></figure><p>When a client connects to a MongoDB replica set instance with the connection string above, read requests are sent to Secondary nodes first, which gives read/write splitting. The client also detects which node is currently primary and, when that changes, automatically switches writes to the new Primary node, keeping the service highly available.</p><h2 id="代码实现:"><a href="#代码实现:" class="headerlink" title="代码实现:"></a>Code:</h2><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> pymongo <span class="keyword">import</span> MongoClient</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">get_mongo_conn_url_replicaset</span>(<span class="params">ip_port_list, user=None, pwd=None, set_name=None,set_authSource=None</span>):</span></span><br><span class="line"> url = <span class="string">'mongodb://'</span></span><br><span class="line"> <span class="keyword">if</span> user <span class="keyword">is</span> <span class="keyword">not</span> <span class="literal">None</span>:</span><br><span class="line"> <span class="keyword">if</span> pwd <span class="keyword">is</span> <span class="literal">None</span>:</span><br><span class="line"> pwd = user</span><br><span class="line"> url += <span class="string">'%s:%s@'</span> % (user, pwd)</span><br><span class="line"> url += <span class="string">','</span>.join(ip_port_list)</span><br><span class="line"> <span class="keyword">if</span> set_name <span class="keyword">is</span> <span class="keyword">not</span> <span class="literal">None</span>:</span><br><span class="line"> url += <span class="string">'/?replicaSet=%s&authSource=%s&readPreference=secondaryPreferred'</span> % (set_name,set_authSource)</span><br><span class="line"> <span class="keyword">return</span> url</span><br><span class="line"></span><br><span class="line">ip_port_list = [<span class="string">'m3007.test.mongodb.m.com:3777'</span>, <span class="string">'s3007.test1.mongodb.m.com:3777'</span>, <span class="string">'s3007.test2.mongodb.m.com:3777'</span>]</span><br><span class="line">conn_url = get_mongo_conn_url_replicaset(ip_port_list, <span class="string">"username"</span>, <span class="string">"password"</span>, <span class="string">"rs_3777"</span>,<span class="string">"admin"</span>)</span><br><span class="line">cli = MongoClient(conn_url)</span><br></pre></td></tr></table></figure>]]></content>
<summary type="html"><h2 id="前言:"><a href="#前言:" class="headerlink" title="前言:"></a>Preface:</h2><p>Connect to a MongoDB replica set with pymongo to get read/write splitting and automatic primary/secondary switchover, so that a failure of the primary node does not take the service down.</p>
<h2 id="副本集实例-Connection-String-URI-连接示例"><a href="#副本集实例-Connection-String-URI-连接示例" class="headerlink" title="副本集实例 Connection String URI 连接示例"></a>Replica set instance: Connection String URI example</h2><ol>
<li>Obtain the Connection String URI for the replica set instance. For details, see [<a href="http://docs.mongodb.org/manual/reference/connection-string">Connection String URI</a>]</li>
</ol>
<p>Main parts of the connection string:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">mongodb:&#x2F;&#x2F;username:password@mongodb0.example.com:27017,mongodb1.example.com:27017,mongodb2.example.com:27017&#x2F;admin?replicaSet&#x3D;myRepl&amp;authSource&#x3D;admin</span><br></pre></td></tr></table></figure>
<ul>
<li>replicaSet: specifies the name of the <a href="https://docs.mongodb.com/manual/reference/glossary/#term-replica-set">replica set</a></li>
<li>authSource: specifies the name of the database associated with the user's credentials. <a href="https://docs.mongodb.com/manual/reference/connection-string/#urioption.authSource"><code>authSource</code></a> defaults to the database specified in the connection string.</li>
</ul>
<ol start="2">
<li>Read/write splitting:</li>
</ol>
<p>To enable read/write splitting, add <code>readPreference=secondaryPreferred</code> to the <strong>options</strong> of the Connection String URI, which makes read requests prefer Secondary nodes.<br>For more read options, see <a href="https://docs.mongodb.com/manual/core/read-preference/">Read preferences</a>.</p></summary>
<category term="mongodb" scheme="https://luochenxi.github.io/categories/mongodb/"/>
<category term="mongo" scheme="https://luochenxi.github.io/tags/mongo/"/>
<category term="pymongo" scheme="https://luochenxi.github.io/tags/pymongo/"/>
</entry>
<entry>
<title>How passwordless login works and how to configure it</title>
<link href="https://luochenxi.github.io/2019/07/22/yuque/%E5%85%8D%E5%AF%86%E7%A0%81%E7%99%BB%E5%BD%95%E7%9A%84%E5%8E%9F%E7%90%86%E5%92%8C%E9%85%8D%E7%BD%AE/"/>
<id>https://luochenxi.github.io/2019/07/22/yuque/%E5%85%8D%E5%AF%86%E7%A0%81%E7%99%BB%E5%BD%95%E7%9A%84%E5%8E%9F%E7%90%86%E5%92%8C%E9%85%8D%E7%BD%AE/</id>
<published>2019-07-22T12:58:24.000Z</published>
<updated>2020-08-25T10:08:52.372Z</updated>
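<content type="html"><![CDATA[<p>Note: whoever wants to log in generates the key pair, e.g. if A wants to log in to B, A generates its own key pair. Steps:<br>1: ssh-keygen -t rsa</p><p>2: ssh-copy-id -i .ssh/id_rsa.pub root@test111</p>]]></content>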
<summary type="html"><meta name="referrer" content="no-referrer"/>
:</span></span><br><span class="line"> <span class="string">"""</span></span><br><span class="line"><span class="string"> Convert a 128-bit binary-string feature value to a list, splitting it into chunks of byte_n bits</span></span><br><span class="line"><span class="string"> :param origin: "10100011011100101100101001100010101101111101101100101010100011001001011100100000011111011111110110001100111000010101001010110001"</span></span><br><span class="line"><span class="string"> :return: [ 163, 114, 202, 98, 183, 219, 42, 140, 151, 32, 125, 253, 140, 225, 82, 177 ]</span></span><br><span class="line"><span class="string"> """</span></span><br><span class="line"> result = [int(code[i:i + byte_n], <span class="number">2</span>) <span class="keyword">for</span> i <span class="keyword">in</span> range(len(code)) <span class="keyword">if</span> i % byte_n == <span class="number">0</span>]</span><br><span class="line"> <span class="keyword">return</span> result</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">int2bin</span>(<span class="params">n, count=<span class="number">24</span></span>):</span></span><br><span class="line"> <span class="string">"""</span></span><br><span class="line"><span class="string"> returns the binary of integer n, using count number of digits</span></span><br><span class="line"><span class="string"> """</span></span><br><span class="line"> <span class="keyword">return</span> <span class="string">""</span>.join([str((n >> y) & <span class="number">1</span>) <span class="keyword">for</span> y <span class="keyword">in</span> range(count - <span class="number">1</span>, <span class="number">-1</span>, <span class="number">-1</span>)])</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">unconvert_md5</span>(<span class="params">origin,byte_n=<span class="number">8</span></span>):</span></span><br><span class="line"> <span class="string">"""</span></span><br><span class="line"><span class="string"> Convert the feature list back to a 128-character binary string</span></span><br><span class="line"><span class="string"> :param origin: [ 163, 114, 202, 98, 183, 219, 42, 140, 151, 32, 125, 253, 140, 225, 82, 177 ]</span></span><br><span class="line"><span class="string"> :return: '10100011011100101100101001100010101101111101101100101010100011001001011100100000011111011111110110001100111000010101001010110001'</span></span><br><span class="line"><span class="string"> """</span></span><br><span class="line"> <span class="keyword">return</span> <span class="string">""</span>.join([str(int2bin(i, byte_n)) <span class="keyword">for</span> i <span class="keyword">in</span> origin])</span><br></pre></td></tr></table></figure>]]></content>
<summary type="html"><p>date: 2017-01-02 00:00:00<br>tags: [python,bytes_conversion]<br>categories:PYTHON</p>
<hr>
<h4 id="十进制,二进制转换"><a href="#十进制,二进制转换" class=</summary>
</entry>
<entry>
<title>Connecting to MongoDB through multiple data sources with mgo in Go</title>
<link href="https://luochenxi.github.io/2019/05/09/yuque/%E5%9C%A8golang%E4%B8%AD%E7%94%A8mgo%E5%AE%9E%E7%8E%B0%E5%A4%9A%E6%95%B0%E6%8D%AE%E6%BA%90%E8%BF%9E%E6%8E%A5mongodb/"/>
<id>https://luochenxi.github.io/2019/05/09/yuque/%E5%9C%A8golang%E4%B8%AD%E7%94%A8mgo%E5%AE%9E%E7%8E%B0%E5%A4%9A%E6%95%B0%E6%8D%AE%E6%BA%90%E8%BF%9E%E6%8E%A5mongodb/</id>
<published>2019-05-08T16:00:00.000Z</published>
<updated>2020-08-25T10:08:52.480Z</updated>
<content type="html"><![CDATA[<p>Preface:<br>Use mgo in Go to connect to MongoDB through multiple data sources (a primary and a standby address), so that the connection survives the loss of one node.</p><h2 id="Code"><a href="#Code" class="headerlink" title="Code"></a>Code</h2><h3 id="utilpool"><a href="#utilpool" class="headerlink" title="utilpool"></a>utilpool</h3><a id="more"></a><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span 
class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span 
class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span class="line">120</span><br><span class="line">121</span><br><span class="line">122</span><br><span class="line">123</span><br><span class="line">124</span><br><span class="line">125</span><br><span class="line">126</span><br><span class="line">127</span><br><span class="line">128</span><br><span class="line">129</span><br><span class="line">130</span><br><span class="line">131</span><br><span class="line">132</span><br><span class="line">133</span><br><span class="line">134</span><br><span class="line">135</span><br><span class="line">136</span><br><span class="line">137</span><br><span class="line">138</span><br><span class="line">139</span><br><span class="line">140</span><br><span class="line">141</span><br><span class="line">142</span><br><span class="line">143</span><br><span class="line">144</span><br><span class="line">145</span><br><span class="line">146</span><br><span class="line">147</span><br><span class="line">148</span><br><span class="line">149</span><br><span class="line">150</span><br><span class="line">151</span><br><span class="line">152</span><br><span class="line">153</span><br><span class="line">154</span><br><span class="line">155</span><br><span class="line">156</span><br><span class="line">157</span><br><span class="line">158</span><br><span class="line">159</span><br><span class="line">160</span><br><span class="line">161</span><br><span class="line">162</span><br><span class="line">163</span><br><span class="line">164</span><br><span class="line">165</span><br><span class="line">166</span><br><span class="line">167</span><br><span class="line">168</span><br><span class="line">169</span><br><span class="line">170</span><br><span class="line">171</span><br><span class="line">172</span><br><span class="line">173</span><br><span class="line">174</span><br><span 
class="line">175</span><br><span class="line">176</span><br><span class="line">177</span><br><span class="line">178</span><br><span class="line">179</span><br><span class="line">180</span><br><span class="line">181</span><br><span class="line">182</span><br><span class="line">183</span><br><span class="line">184</span><br><span class="line">185</span><br><span class="line">186</span><br><span class="line">187</span><br><span class="line">188</span><br><span class="line">189</span><br><span class="line">190</span><br><span class="line">191</span><br><span class="line">192</span><br><span class="line">193</span><br><span class="line">194</span><br><span class="line">195</span><br><span class="line">196</span><br><span class="line">197</span><br><span class="line">198</span><br><span class="line">199</span><br><span class="line">200</span><br><span class="line">201</span><br><span class="line">202</span><br><span class="line">203</span><br><span class="line">204</span><br><span class="line">205</span><br><span class="line">206</span><br><span class="line">207</span><br><span class="line">208</span><br><span class="line">209</span><br><span class="line">210</span><br><span class="line">211</span><br><span class="line">212</span><br><span class="line">213</span><br><span class="line">214</span><br><span class="line">215</span><br><span class="line">216</span><br><span class="line">217</span><br><span class="line">218</span><br><span class="line">219</span><br><span class="line">220</span><br><span class="line">221</span><br><span class="line">222</span><br><span class="line">223</span><br><span class="line">224</span><br><span class="line">225</span><br><span class="line">226</span><br><span class="line">227</span><br><span class="line">228</span><br><span class="line">229</span><br><span class="line">230</span><br><span class="line">231</span><br><span class="line">232</span><br><span class="line">233</span><br><span class="line">234</span><br><span 
class="line">235</span><br><span class="line">236</span><br><span class="line">237</span><br><span class="line">238</span><br><span class="line">239</span><br><span class="line">240</span><br><span class="line">241</span><br><span class="line">242</span><br><span class="line">243</span><br><span class="line">244</span><br><span class="line">245</span><br><span class="line">246</span><br><span class="line">247</span><br><span class="line">248</span><br><span class="line">249</span><br><span class="line">250</span><br><span class="line">251</span><br><span class="line">252</span><br><span class="line">253</span><br><span class="line">254</span><br><span class="line">255</span><br><span class="line">256</span><br><span class="line">257</span><br><span class="line">258</span><br><span class="line">259</span><br><span class="line">260</span><br><span class="line">261</span><br><span class="line">262</span><br><span class="line">263</span><br><span class="line">264</span><br><span class="line">265</span><br><span class="line">266</span><br><span class="line">267</span><br><span class="line">268</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// mongodb.go</span></span><br><span class="line"><span class="keyword">package</span> db</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> (</span><br><span class="line"><span class="string">"log"</span></span><br><span class="line"><span class="string">"time"</span></span><br><span class="line"></span><br><span class="line"><span class="string">"github.com/globalsign/mgo"</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="keyword">const</span> (</span><br><span class="line">dbhost_master = <span class="string">"127.0.0.1:27017"</span></span><br><span class="line">dbhost_slave = <span class="string">"127.0.0.2:27017"</span></span><br><span class="line">authdb = <span 
class="string">"admin"</span></span><br><span class="line">authuser = <span class="string">"user"</span></span><br><span class="line">authpass = <span class="string">"123456"</span></span><br><span class="line">timeout = <span class="number">60</span> * time.Second</span><br><span class="line">poollimit = <span class="number">4096</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> globalS *mgo.Session</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">init</span><span class="params">()</span></span> {</span><br><span class="line">dialInfo := &mgo.DialInfo{</span><br><span class="line">Addrs: []<span class="keyword">string</span>{dbhost_master,dbhost_slave},</span><br><span class="line">Timeout: timeout,</span><br><span class="line">Source: authdb,</span><br><span class="line">Username: authuser,</span><br><span class="line">Password: authpass,</span><br><span class="line">PoolLimit: poollimit,</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">s, err := mgo.DialWithInfo(dialInfo)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line">log.Fatalf(<span class="string">"Create Session: %s\n"</span>, err)</span><br><span class="line">}</span><br><span class="line">globalS = s</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">connect</span><span class="params">(db, collection <span class="keyword">string</span>)</span> <span class="params">(*mgo.Session, *mgo.Collection)</span></span> {</span><br><span class="line">ms := globalS.Copy()</span><br><span class="line">c := ms.DB(db).C(collection)</span><br><span class="line">ms.SetMode(mgo.Monotonic, <span 
class="literal">true</span>)</span><br><span class="line"><span class="keyword">return</span> ms, c</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">getDb</span><span class="params">(db <span class="keyword">string</span>)</span> <span class="params">(*mgo.Session, *mgo.Database)</span></span> {</span><br><span class="line">ms := globalS.Copy()</span><br><span class="line"><span class="keyword">return</span> ms, ms.DB(db)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">IsEmpty</span><span class="params">(db, collection <span class="keyword">string</span>)</span> <span class="title">bool</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line">count, err := c.Count()</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line">log.Fatal(err)</span><br><span class="line">}</span><br><span class="line"><span class="keyword">return</span> count == <span class="number">0</span></span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">Count</span><span class="params">(db, collection <span class="keyword">string</span>, query <span class="keyword">interface</span>{})</span> <span class="params">(<span class="keyword">int</span>, error)</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line"><span class="keyword">return</span> c.Find(query).Count()</span><br><span class="line">}</span><br><span 
class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">Insert</span><span class="params">(db, collection <span class="keyword">string</span>, docs ...<span class="keyword">interface</span>{})</span> <span class="title">error</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> c.Insert(docs...)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">FindOne</span><span class="params">(db, collection <span class="keyword">string</span>, query, selector, result <span class="keyword">interface</span>{})</span> <span class="title">error</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> c.Find(query).Select(selector).One(result)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">FindAll</span><span class="params">(db, collection <span class="keyword">string</span>, query, selector, result <span class="keyword">interface</span>{})</span> <span class="title">error</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> c.Find(query).Select(selector).All(result)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span 
class="title">FindPage</span><span class="params">(db, collection <span class="keyword">string</span>, page, limit <span class="keyword">int</span>, query, selector, result <span class="keyword">interface</span>{})</span> <span class="title">error</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line"><span class="comment">// page is zero-based: page 0 returns the first limit documents.</span></span><br><span class="line"><span class="keyword">return</span> c.Find(query).Select(selector).Skip(page * limit).Limit(limit).All(result)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">FindIter</span><span class="params">(db, collection <span class="keyword">string</span>, query <span class="keyword">interface</span>{})</span> *<span class="title">mgo</span>.<span class="title">Iter</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line"><span class="comment">// NOTE: ms is closed as soon as FindIter returns, so a cursor that must fetch further batches may fail.</span></span><br><span class="line"><span class="keyword">return</span> c.Find(query).Iter()</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">Update</span><span class="params">(db, collection <span class="keyword">string</span>, selector, update <span class="keyword">interface</span>{})</span> <span class="title">error</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> c.Update(selector, update)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span 
class="title">Upsert</span><span class="params">(db, collection <span class="keyword">string</span>, selector, update <span class="keyword">interface</span>{})</span> <span class="title">error</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line"></span><br><span class="line">_, err := c.Upsert(selector, update)</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">UpdateAll</span><span class="params">(db, collection <span class="keyword">string</span>, selector, update <span class="keyword">interface</span>{})</span> <span class="title">error</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line"></span><br><span class="line">_, err := c.UpdateAll(selector, update)</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">Remove</span><span class="params">(db, collection <span class="keyword">string</span>, selector <span class="keyword">interface</span>{})</span> <span class="title">error</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> c.Remove(selector)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">RemoveAll</span><span class="params">(db, 
collection <span class="keyword">string</span>, selector <span class="keyword">interface</span>{})</span> <span class="title">error</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line"></span><br><span class="line">_, err := c.RemoveAll(selector)</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// BulkInsert inserts one or more documents in a single bulk operation.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">BulkInsert</span><span class="params">(db, collection <span class="keyword">string</span>, docs ...<span class="keyword">interface</span>{})</span> <span class="params">(*mgo.BulkResult, error)</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line">bulk := c.Bulk()</span><br><span class="line">bulk.Insert(docs...)</span><br><span class="line"><span class="keyword">return</span> bulk.Run()</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">BulkRemove</span><span class="params">(db, collection <span class="keyword">string</span>, selector ...<span class="keyword">interface</span>{})</span> <span class="params">(*mgo.BulkResult, error)</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line"></span><br><span class="line">bulk := c.Bulk()</span><br><span class="line">bulk.Remove(selector...)</span><br><span class="line"><span class="keyword">return</span> bulk.Run()</span><br><span class="line">}</span><br><span 
class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">BulkRemoveAll</span><span class="params">(db, collection <span class="keyword">string</span>, selector ...<span class="keyword">interface</span>{})</span> <span class="params">(*mgo.BulkResult, error)</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line">bulk := c.Bulk()</span><br><span class="line">bulk.RemoveAll(selector...)</span><br><span class="line"><span class="keyword">return</span> bulk.Run()</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">BulkUpdate</span><span class="params">(db, collection <span class="keyword">string</span>, pairs ...<span class="keyword">interface</span>{})</span> <span class="params">(*mgo.BulkResult, error)</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line">bulk := c.Bulk()</span><br><span class="line">bulk.Update(pairs...)</span><br><span class="line"><span class="keyword">return</span> bulk.Run()</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">BulkUpdateAll</span><span class="params">(db, collection <span class="keyword">string</span>, pairs ...<span class="keyword">interface</span>{})</span> <span class="params">(*mgo.BulkResult, error)</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line">bulk := c.Bulk()</span><br><span class="line">bulk.UpdateAll(pairs...)</span><br><span class="line"><span 
class="keyword">return</span> bulk.Run()</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">BulkUpsert</span><span class="params">(db, collection <span class="keyword">string</span>, pairs ...<span class="keyword">interface</span>{})</span> <span class="params">(*mgo.BulkResult, error)</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line">bulk := c.Bulk()</span><br><span class="line">bulk.Upsert(pairs...)</span><br><span class="line"><span class="keyword">return</span> bulk.Run()</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">PipeAll</span><span class="params">(db, collection <span class="keyword">string</span>, pipeline, result <span class="keyword">interface</span>{}, allowDiskUse <span class="keyword">bool</span>)</span> <span class="title">error</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line"><span class="keyword">var</span> pipe *mgo.Pipe</span><br><span class="line"><span class="keyword">if</span> allowDiskUse {</span><br><span class="line">pipe = c.Pipe(pipeline).AllowDiskUse()</span><br><span class="line">} <span class="keyword">else</span> {</span><br><span class="line">pipe = c.Pipe(pipeline)</span><br><span class="line">}</span><br><span class="line"><span class="keyword">return</span> pipe.All(result)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">PipeOne</span><span class="params">(db, collection <span class="keyword">string</span>, pipeline, 
result <span class="keyword">interface</span>{}, allowDiskUse <span class="keyword">bool</span>)</span> <span class="title">error</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line"><span class="keyword">var</span> pipe *mgo.Pipe</span><br><span class="line"><span class="keyword">if</span> allowDiskUse {</span><br><span class="line">pipe = c.Pipe(pipeline).AllowDiskUse()</span><br><span class="line">} <span class="keyword">else</span> {</span><br><span class="line">pipe = c.Pipe(pipeline)</span><br><span class="line">}</span><br><span class="line"><span class="keyword">return</span> pipe.One(result)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">PipeIter</span><span class="params">(db, collection <span class="keyword">string</span>, pipeline <span class="keyword">interface</span>{}, allowDiskUse <span class="keyword">bool</span>)</span> *<span class="title">mgo</span>.<span class="title">Iter</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line"><span class="keyword">var</span> pipe *mgo.Pipe</span><br><span class="line"><span class="keyword">if</span> allowDiskUse {</span><br><span class="line">pipe = c.Pipe(pipeline).AllowDiskUse()</span><br><span class="line">} <span class="keyword">else</span> {</span><br><span class="line">pipe = c.Pipe(pipeline)</span><br><span class="line">}</span><br><span class="line"><span class="comment">// NOTE: ms is closed as soon as PipeIter returns, so a cursor that must fetch further batches may fail.</span></span><br><span class="line"><span class="keyword">return</span> pipe.Iter()</span><br><span class="line"></span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span 
class="title">Explain</span><span class="params">(db, collection <span class="keyword">string</span>, pipeline, result <span class="keyword">interface</span>{})</span> <span class="title">error</span></span> {</span><br><span class="line">ms, c := connect(db, collection)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line">pipe := c.Pipe(pipeline)</span><br><span class="line"><span class="keyword">return</span> pipe.Explain(result)</span><br><span class="line">}</span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">GridFSCreate</span><span class="params">(db, prefix, name <span class="keyword">string</span>)</span> <span class="params">(*mgo.GridFile, error)</span></span> {</span><br><span class="line">ms, d := getDb(db)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line">gridFs := d.GridFS(prefix)</span><br><span class="line"><span class="keyword">return</span> gridFs.Create(name)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">GridFSFindOne</span><span class="params">(db, prefix <span class="keyword">string</span>, query, result <span class="keyword">interface</span>{})</span> <span class="title">error</span></span> {</span><br><span class="line">ms, d := getDb(db)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line">gridFs := d.GridFS(prefix)</span><br><span class="line"><span class="keyword">return</span> gridFs.Find(query).One(result)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">GridFSFindAll</span><span class="params">(db, prefix <span class="keyword">string</span>, query, result <span 
class="keyword">interface</span>{})</span> <span class="title">error</span></span> {</span><br><span class="line">ms, d := getDb(db)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line">gridFs := d.GridFS(prefix)</span><br><span class="line"><span class="keyword">return</span> gridFs.Find(query).All(result)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">GridFSOpen</span><span class="params">(db, prefix, name <span class="keyword">string</span>)</span> <span class="params">(*mgo.GridFile, error)</span></span> {</span><br><span class="line">ms, d := getDb(db)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line">gridFs := d.GridFS(prefix)</span><br><span class="line"><span class="keyword">return</span> gridFs.Open(name)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">GridFSRemove</span><span class="params">(db, prefix, name <span class="keyword">string</span>)</span> <span class="title">error</span></span> {</span><br><span class="line">ms, d := getDb(db)</span><br><span class="line"><span class="keyword">defer</span> ms.Close()</span><br><span class="line">gridFs := d.GridFS(prefix)</span><br><span class="line"><span class="keyword">return</span> gridFs.Remove(name)</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="TestCode"><a href="#TestCode" class="headerlink" title="TestCode"></a>TestCode</h3><figure class="highlight go"><table><tr>
<td class="code"><pre><span class="line"><span class="comment">// drop.go</span></span><br><span class="line"><span class="keyword">package</span> main</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> (</span><br><span class="line"><span class="string">"fmt"</span></span><br><span class="line">db <span class="string">"mongo/mongodb"</span></span><br><span class="line"><span class="string">"time"</span></span><br><span class="line"></span><br><span class="line"><span class="string">"github.com/globalsign/mgo/bson"</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="keyword">type</span> Data <span class="keyword">struct</span> {</span><br><span class="line">Id bson.ObjectId <span class="string">`bson:"_id"`</span></span><br><span class="line">Title <span class="keyword">string</span> <span class="string">`bson:"title"`</span></span><br><span class="line">Des <span class="keyword">string</span> <span class="string">`bson:"des"`</span></span><br><span class="line">Content <span class="keyword">string</span> <span class="string">`bson:"content"`</span></span><br><span class="line">Date time.Time <span class="string">`bson:"date"`</span></span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="keyword">const</span> (</span><br><span class="line">database = <span class="string">"Test"</span></span><br><span class="line">collection = <span 
class="string">"TestModel"</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment">// Examples: common MongoDB operations using mgo</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> {</span><br><span class="line"><span class="comment">// insert one document</span></span><br><span class="line">data := &Data{</span><br><span class="line">Id: bson.NewObjectId(),</span><br><span class="line">Title: <span class="string">"Blog title 1"</span>,</span><br><span class="line">Des: <span class="string">"Blog description 1"</span>,</span><br><span class="line">Content: <span class="string">"Blog content 1"</span>,</span><br><span class="line">Date: time.Now(),</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">err := db.Insert(database, collection, data)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line">fmt.Println(<span class="string">"insert one doc error"</span>, err)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// find one with all fields</span></span><br><span class="line"><span class="keyword">var</span> result Data</span><br><span class="line">err = db.FindOne(database, collection, bson.M{<span class="string">"_id"</span>: bson.ObjectIdHex(<span class="string">"5b3db2334d661ff46ee14b9c"</span>)}, <span class="literal">nil</span>, &result)</span><br><span class="line">fmt.Println(<span class="string">"find one with all fields"</span>, result, err)</span><br><span class="line"></span><br><span class="line"><span class="comment">// find one without id field</span></span><br><span class="line"><span class="keyword">var</span> result1 Data</span><br><span class="line">err = db.FindOne(database, collection, bson.M{<span class="string">"_id"</span>: 
bson.ObjectIdHex(<span class="string">"5b3db2334d661ff46ee14b9c"</span>)}, bson.M{<span class="string">"_id"</span>: <span class="number">0</span>}, &result1)</span><br><span class="line">fmt.Println(<span class="string">"find one without id field"</span>, result1, err)</span><br><span class="line"></span><br><span class="line"><span class="comment">// find all documents</span></span><br><span class="line"><span class="keyword">var</span> allResult []Data</span><br><span class="line">err = db.FindAll(database, collection, <span class="literal">nil</span>, <span class="literal">nil</span>, &allResult)</span><br><span class="line">fmt.Println(<span class="string">"find all docs"</span>, allResult, err)</span><br><span class="line"></span><br><span class="line"><span class="comment">// find all documents with query and selector</span></span><br><span class="line"><span class="keyword">var</span> allResult1 []Data</span><br><span class="line">err = db.FindAll(database, collection, bson.M{<span class="string">"title"</span>: <span class="string">"Blog title 1"</span>}, bson.M{<span class="string">"_id"</span>: <span class="number">0</span>}, &allResult1)</span><br><span class="line">fmt.Println(<span class="string">"find all docs with query and selector"</span>, allResult1, err)</span><br><span class="line"></span><br><span class="line"><span class="comment">// find documents with page and limit</span></span><br><span class="line"><span class="keyword">var</span> resultWithPage []Data</span><br><span class="line">err = db.FindPage(database, collection, <span class="number">0</span>, <span class="number">4</span>, <span class="literal">nil</span>, bson.M{<span class="string">"_id"</span>: <span class="number">0</span>}, &resultWithPage)</span><br><span class="line">fmt.Println(<span class="string">"find docs with page and limit"</span>, resultWithPage, err)</span><br><span class="line"></span><br><span class="line"><span class="comment">// iterate with a cursor</span></span><br><span class="line"><span 
class="keyword">var</span> iterAll []Data</span><br><span class="line">iter := db.FindIter(database, collection, <span class="literal">nil</span>)</span><br><span class="line">err = iter.All(&iterAll)</span><br><span class="line">fmt.Println(<span class="string">"find cursor "</span>, iterAll)</span><br><span class="line"></span><br><span class="line"><span class="comment">//update one document</span></span><br><span class="line">err = db.Update(database, collection, bson.M{<span class="string">"_id"</span>: bson.ObjectIdHex(<span class="string">"5b3db2334d661ff46ee14b9c"</span>)}, bson.M{<span class="string">"$set"</span>: bson.M{</span><br><span class="line"><span class="string">"title"</span>: <span class="string">"更新后的标题"</span>,</span><br><span class="line"><span class="string">"des"</span>: <span class="string">"更新后的描述信息"</span>,</span><br><span class="line"><span class="string">"date"</span>: time.Now(),</span><br><span class="line">}})</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line">fmt.Println(<span class="string">"upate one error"</span>, err)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">//update all docments</span></span><br><span class="line"><span class="comment">/*err = db.UpdateAll(database, collection, nil, bson.M{"$set": bson.M{</span></span><br><span class="line"><span class="comment">"title": "更新所有的标题",</span></span><br><span class="line"><span class="comment">"date": time.Now(),</span></span><br><span class="line"><span class="comment">}})</span></span><br><span class="line"><span class="comment">if err != nil {</span></span><br><span class="line"><span class="comment">fmt.Println("update all docs error ", err)</span></span><br><span class="line"><span class="comment">}*/</span></span><br><span class="line"></span><br><span class="line"><span class="comment">//delete 
one document</span></span><br><span class="line">err = db.Remove(database, collection, bson.M{<span class="string">"_id"</span>: bson.ObjectIdHex(<span class="string">"5b3db2334d661ff46ee14b99"</span>)})</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line">fmt.Println(<span class="string">"remove one doc error"</span>, err)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">//upsert the document</span></span><br><span class="line">err = db.Upsert(database, collection, bson.M{<span class="string">"title"</span>: <span class="string">"Title Upsert"</span>}, bson.M{<span class="string">"$set"</span>: bson.M{</span><br><span class="line"><span class="string">"des"</span>: <span class="string">"description Upsert"</span>,</span><br><span class="line"><span class="string">"date"</span>: time.Now(),</span><br><span class="line"><span class="string">"Content"</span>: <span class="string">"content Upsert"</span>,</span><br><span class="line">}})</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line">fmt.Println(<span class="string">"upsert document error"</span>, err)</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">//bulk insert documents</span></span><br><span class="line">d1 := &Data{</span><br><span class="line">Id: bson.NewObjectId(),</span><br><span class="line">Title: <span class="string">"bulk title"</span>,</span><br><span class="line">Des: <span class="string">"bulk Des"</span>,</span><br><span class="line">Content: <span class="string">"bulk content"</span>,</span><br><span class="line">Date: time.Now(),</span><br><span class="line">}</span><br><span class="line">d2 := &Data{</span><br><span class="line">Id: bson.NewObjectId(),</span><br><span class="line">Title: <span class="string">"bulk 
title"</span>,</span><br><span class="line">Des: <span class="string">"bulk Des"</span>,</span><br><span class="line">Content: <span class="string">"bulk content"</span>,</span><br><span class="line">Date: time.Now(),</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">insertResult, _ := db.BulkInsert(database, collection, d1, d2)</span><br><span class="line">fmt.Println(<span class="string">"bulk insert docs"</span>, insertResult)</span><br><span class="line"></span><br><span class="line"><span class="comment">//bulk update</span></span><br><span class="line">up1 := bson.M{<span class="string">"title"</span>: <span class="string">"bulk update title"</span>}</span><br><span class="line">up2 := bson.M{<span class="string">"$set"</span>: bson.M{<span class="string">"title"</span>: <span class="string">"bulk update title"</span>}}</span><br><span class="line"></span><br><span class="line">up3 := bson.M{<span class="string">"_id"</span>: bson.ObjectIdHex(<span class="string">"5b3dbd7a9d5e3e314c93d150"</span>)}</span><br><span class="line">up4 := bson.M{<span class="string">"$set"</span>: bson.M{<span class="string">"des"</span>: <span class="string">"bulk update des"</span>}}</span><br><span class="line"></span><br><span class="line">updateResult, _ := db.BulkUpdate(database, collection, up1, up2, up3, up4)</span><br><span class="line">fmt.Println(<span class="string">"bulk update result"</span>, updateResult)</span><br><span class="line"></span><br><span class="line"><span class="comment">//bulk update all</span></span><br><span class="line">up5 := bson.M{<span class="string">"title"</span>: <span class="string">"bulk title"</span>}</span><br><span class="line">up6 := bson.M{<span class="string">"$set"</span>: bson.M{<span class="string">"title"</span>: <span class="string">"bulk update title"</span>}}</span><br><span class="line"></span><br><span class="line">updateAllResult, _ := db.BulkUpdateAll(database, collection, up5, 
up6)</span><br><span class="line">fmt.Println(<span class="string">"bulk update result"</span>, updateAllResult)</span><br><span class="line"></span><br><span class="line">}</span><br></pre></td></tr></table></figure>]]></content>
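<summary type="html"><p>Preface:<br>Use mgo in Go to connect to MongoDB through multiple data sources, so that a primary and a standby connection are available for disaster recovery.</p>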
<h2 id="Code"><a href="#Code" class="headerlink" title="Code"></a>Code</h2><h3 id="utilpool"><a href="#utilpool" class="headerlink" title="utilpool"></a>utilpool</h3></summary>
<category term="go" scheme="https://luochenxi.github.io/categories/go/"/>
<category term="go" scheme="https://luochenxi.github.io/tags/go/"/>
<category term="mgo" scheme="https://luochenxi.github.io/tags/mgo/"/>
<category term="mongodb" scheme="https://luochenxi.github.io/tags/mongodb/"/>
</entry>
<entry>
<title>Understanding Jaccard Similarity, MinHash, and LSH</title>
<link href="https://luochenxi.github.io/2019/03/28/yuque/%E7%90%86%E8%A7%A3Jaccard%E7%9B%B8%E4%BC%BC%E5%BA%A6,Minhash%E5%92%8CLSH%E7%AE%97%E6%B3%95/"/>
<id>https://luochenxi.github.io/2019/03/28/yuque/%E7%90%86%E8%A7%A3Jaccard%E7%9B%B8%E4%BC%BC%E5%BA%A6,Minhash%E5%92%8CLSH%E7%AE%97%E6%B3%95/</id>
<published>2019-03-28T07:48:00.000Z</published>
<updated>2020-08-25T10:08:52.640Z</updated>
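<content type="html"><![CDATA[<p>Preface:<br>Notes from studying Jaccard similarity, MinHash, and LSH, kept here for future reference.</p><h3 id="Jaccard-相似度-狭义"><a href="#Jaccard-相似度-狭义" class="headerlink" title="Jaccard 相似度[狭义]"></a>Jaccard similarity (narrow sense)</h3><p>My understanding is limited, so I will not go into the generalized form.<br>A common way to measure how similar two sets are is the Jaccard similarity (below, Jac(S1,S2) denotes the Jaccard similarity of sets S1 and S2). For example, take X = {a,b,c} and Y = {b,c,d}. The intersection {b,c} has 2 elements and the union {a,b,c,d} has 4, so Jac(X,Y) = 2 / 4 = 0.5. In other words, X and Y share 50% of their combined elements. Formally, the Jaccard similarity is:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Jac(X,Y) = |X∩Y| / |X∪Y|</span><br></pre></td></tr></table></figure><p>That is, the size of the intersection (∩) of the two sets divided by the size of their union (∪). The value lies in [0,1].</p><a id="more"></a><h4 id="由相似度,可以转换成-Jaccard-距离:"><a href="#由相似度,可以转换成-Jaccard-距离:" class="headerlink" title="由相似度,可以转换成 Jaccard 距离:"></a>Converting similarity into Jaccard distance:</h4><p>Jaccard distance (A, B) = 1 - Jaccard(A, B)</p><h3 id="降维技术-Minhash"><a href="#降维技术-Minhash" class="headerlink" title="降维技术 Minhash:"></a>Dimensionality reduction: MinHash</h3><p>Computing similarity directly on the original high-dimensional sets takes too long. MinHash <strong>compresses</strong> each set into a much smaller signature while <strong>approximately preserving similarity</strong>, which <strong>shortens</strong> computation time.</p><h3 id="LSH-–-局部敏感哈希"><a href="#LSH-–-局部敏感哈希" class="headerlink" title="LSH – 局部敏感哈希:"></a><strong>LSH (locality-sensitive hashing):</strong></h3><p>Use case: fast approximate nearest-neighbor search over massive high-dimensional data.<br>Basic idea: group similar sets together to narrow the search range and avoid comparing dissimilar sets.</p><p>Index-like techniques of this kind speed up search. They are usually classified as <strong>nearest neighbor search</strong> (Nearest Neighbor, NN), for example K-d tree, or <strong>approximate nearest neighbor search</strong> (Approximate Nearest Neighbor, ANN), for example K-d tree with BBF, Randomized Kd-trees, and Hierarchical K-means Tree. LSH is one family of ANN methods.</p><p>A probabilistic bucketing scheme inevitably lets some cases slip through. We want as few sets as possible to fall into the following two cases:</p><ul><li>False Positives: two sets with very low similarity are hashed into the same bucket;</li><li>False Negatives: two truly similar sets are not hashed into the same bucket in any band.</li></ul><h3 id="Cell-probe-方法"><a href="#Cell-probe-方法" class="headerlink" title="Cell-probe 方法"></a>Cell-probe methods</h3><p>A typical way to speed up search is to partition the data set. Faiss uses a partition-based method built on multi-probing (a variant of the best-bin KD-tree).</p><ul><li>The feature space is partitioned into ncells cells.</li><li>Data vectors are assigned to these cells (with k-means, to the centroid at the smallest Euclidean distance), and the assignments are stored in an inverted file of ncells inverted lists.</li><li>At search time, the nprobe cells closest to the query are selected.</li><li>All vectors in those nprobe cells are retrieved through the inverted lists and compared with the query.</li></ul>
<p>This is IndexIVFFlat; it needs another index (the quantizer) that assigns vectors to the inverted lists.<br>IndexIVFKmeans and IndexIVFSphericalKmeans are not objects but functions that return IndexIVFFlat objects.<br><strong>Note</strong>: <em>for high-dimensional data, good recall may require a very large nprobe.</em></p><h3 id="和-LSH-的关系"><a href="#和-LSH-的关系" class="headerlink" title="和 LSH 的关系"></a>Relationship to LSH</h3><p>The most popular cell-probe method is probably the original LSH method; see <a href="http://www.mit.edu/~andoni/LSH/">E2LSH</a>. However, this method and its variants have two major drawbacks:</p><ul><li>They need a large number of hash functions (= partitions) to reach acceptable results.</li><li>The hash functions are not adapted to the input data, so results tend to be suboptimal in practice.</li></ul><h3 id="Faiss-示例"><a href="#Faiss-示例" class="headerlink" title="Faiss 示例"></a>Faiss example</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># d is the dimensionality of the input vectors; n_bits is the number of bits used to store each vector.</span></span><br><span class="line">n_bits = <span class="number">2</span> * d</span><br><span class="line">lsh = faiss.IndexLSH(d, n_bits)</span><br><span class="line">lsh.train(x_train)</span><br><span class="line">lsh.add(x_base)</span><br><span class="line">D, I = lsh.search(x_query, k)</span><br></pre></td></tr></table></figure><p><strong>Note:</strong> the algorithm is <strong>not</strong> vanilla LSH but a better choice. If n_bits <= d, a set of orthogonal projectors is used; if n_bits > d, a tight frame is used.</p><h3 id="小结:"><a href="#小结:" class="headerlink" title="小结:"></a>Summary:</h3><p><strong>MinHash reduces dimensionality; LSH narrows the search range.</strong></p><h3 id="以上内容摘自链接如下:"><a href="#以上内容摘自链接如下:" class="headerlink" title="以上内容摘自链接如下:"></a>The content above is drawn from the following links:</h3><p>Reading them is strongly recommended for a deeper understanding.</p><ul><li><a href="http://www.cnblogs.com/bourneli/archive/2013/04/04/2999767.html">Finding similar sets with MinHash and LSH</a></li><li><a href="https://github.com/facebookresearch/faiss/wiki/Faiss-indexes">Faiss indexes</a></li></ul>]]></content>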
<summary type="html"><p>Preface:<br>Notes from studying Jaccard similarity, MinHash, and LSH, kept here for future reference.</p>
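<h3 id="Jaccard-相似度-狭义"><a href="#Jaccard-相似度-狭义" class="headerlink" title="Jaccard 相似度[狭义]"></a>Jaccard similarity (narrow sense)</h3><p>My understanding is limited, so I will not go into the generalized form.<br>A common way to measure how similar two sets are is the Jaccard similarity (below, Jac(S1,S2) denotes the Jaccard similarity of sets S1 and S2). For example, take X = {a,b,c} and Y = {b,c,d}. The intersection {b,c} has 2 elements and the union {a,b,c,d} has 4, so Jac(X,Y) = 2 / 4 = 0.5. In other words, X and Y share 50% of their combined elements. Formally, the Jaccard similarity is:</p>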
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Jac(X,Y) &#x3D; |X∩Y| &#x2F; |X∪Y|</span><br></pre></td></tr></table></figure>
<p>That is, the size of the intersection (∩) of the two sets divided by the size of their union (∪). The value lies in [0,1].</p></summary>
<category term="Algorithm" scheme="https://luochenxi.github.io/categories/Algorithm/"/>
<category term="Algorithm" scheme="https://luochenxi.github.io/tags/Algorithm/"/>
</entry>
<entry>
<title>Faiss with MKL Acceleration: Error When Training an Index</title>
<link href="https://luochenxi.github.io/2019/03/28/yuque/Faiss%E5%88%A9%E7%94%A8mkl%E5%8A%A0%E9%80%9F,%E6%9E%84%E5%BB%BA%E7%B4%A2%E5%BC%95%E8%AE%AD%E7%BB%83%E6%97%B6%E5%87%BA%E9%94%99/"/>
<id>https://luochenxi.github.io/2019/03/28/yuque/Faiss%E5%88%A9%E7%94%A8mkl%E5%8A%A0%E9%80%9F,%E6%9E%84%E5%BB%BA%E7%B4%A2%E5%BC%95%E8%AE%AD%E7%BB%83%E6%97%B6%E5%87%BA%E9%94%99/</id>
<published>2019-03-28T03:30:00.000Z</published>
<updated>2020-08-25T10:08:52.656Z</updated>
<content type="html"><![CDATA[<p>Preface:<br>A note on a pitfall I hit while training a faiss index.<br>Intel MKL FATAL ERROR: Cannot load libmkl_avx2.so or libmkl_def.so.</p><h4 id="问题:"><a href="#问题:" class="headerlink" title="问题:"></a>Problem:</h4><p>faiss was accelerated with the Intel MKL (Math Kernel Library). Calling index.train() fails with the following error:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Intel MKL FATAL ERROR: Cannot load libmkl_avx2.so or libmkl_def.so.</span><br></pre></td></tr></table></figure><a id="more"></a><h4 id="解决方案"><a href="#解决方案" class="headerlink" title="解决方案"></a>Solution</h4><p>Import mkl and call into it before using faiss:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> mkl</span><br><span class="line">mkl.get_max_threads()</span><br></pre></td></tr></table></figure><p>I do not fully understand why this works. My guess is a version compatibility problem between packages installed with conda. For details, see the <a href="https://github.com/facebookresearch/faiss/issues/753">issue</a> I opened.</p><h4 id="补充"><a href="#补充" class="headerlink" title="补充"></a>Addendum</h4><p>If importing mkl itself fails, for example:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"> import mkl</span><br><span class="line">ImportError: No module named mkl</span><br></pre></td></tr></table></figure><p>fix it as follows:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">### Run:</span></span><br><span class="line">$ conda install mkl</span><br><span class="line">$ conda install mkl-service</span><br></pre></td></tr></table></figure>]]></content>
<summary type="html"><p>Preface:<br>A note on a pitfall I hit while training a faiss index.<br>Intel MKL FATAL ERROR: Cannot load libmkl_avx2.so or libmkl_def.so.</p>
<h4 id="问题:"><a href="#问题:" class="headerlink" title="问题:"></a>Problem:</h4><p>faiss was accelerated with the Intel MKL (Math Kernel Library). Calling index.train() fails with the following error:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Intel MKL FATAL ERROR: Cannot load libmkl_avx2.so or libmkl_def.so.</span><br></pre></td></tr></table></figure></summary>
<category term="Faiss" scheme="https://luochenxi.github.io/categories/Faiss/"/>
<category term="Faiss" scheme="https://luochenxi.github.io/tags/Faiss/"/>
<category term="MKL" scheme="https://luochenxi.github.io/tags/MKL/"/>
</entry>
<entry>
<title>Understanding Product Quantization</title>
<link href="https://luochenxi.github.io/2019/03/25/yuque/%E7%90%86%E8%A7%A3%E4%B9%98%E7%A7%AF%E9%87%8F%E5%8C%96(Product%20Quantization)/"/>
<id>https://luochenxi.github.io/2019/03/25/yuque/%E7%90%86%E8%A7%A3%E4%B9%98%E7%A7%AF%E9%87%8F%E5%8C%96(Product%20Quantization)/</id>
<published>2019-03-25T08:21:00.000Z</published>
<updated>2020-08-25T10:08:52.716Z</updated>
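<content type="html"><![CDATA[<p>Preface:<br> Study notes on PQ (product quantization) and how it is used in faiss, kept here for future reference.</p><h3 id="应用场景:"><a href="#应用场景:" class="headerlink" title="应用场景:"></a>Use case:</h3><p>Fast approximate nearest-neighbor search over high-dimensional feature vectors in massive data sets.</p><h3 id="Product-Quantization-算法"><a href="#Product-Quantization-算法" class="headerlink" title="Product Quantization 算法"></a>The Product Quantization algorithm</h3><p>Product Quantization decomposes the original high-dimensional space into the Cartesian product of a finite number of low-dimensional subspaces and quantizes each subspace separately. This also compresses the memory footprint.</p><a id="more"></a><h3 id="相似检索-距离计算"><a href="#相似检索-距离计算" class="headerlink" title="相似检索(距离计算)"></a>Similarity search (distance computation)</h3><p>The authors propose two ways to compute distances, symmetric distance computation (SDC) and asymmetric distance computation (ADC), shown on the left and right of the figure below:<br><img src="/" class="lazyload" data-src="https://cdn.nlark.com/yuque/0/2019/png/283236/1553501339375-7fc4f332-7792-4643-8608-300770ff6f6e.png#align=left&display=inline&height=194&margin=%5Bobject%20Object%5D&name=2018122015100935.png&originHeight=194&originWidth=395&size=23852&status=done&style=none&width=395" alt="2018122015100935.png"></p><ul><li>Symmetric distance computation: the distance between x and y is replaced by the distance between the codewords q(x) and q(y) that their compressed codes point to. Distances between codewords can be computed offline and stored in a lookup table, so a query only needs table lookups by code index, which is very fast;</li><li>Asymmetric distance computation: the distance between x and y is replaced by the distance between x and q(y), where x is the query vector. Although there may be millions of vectors y, there are only k codewords q(y). For each query x, we first compute the distances from x to all k codewords and store them in a lookup table (fast, because there are only k of them), then read off the entry for each y by its code index.</li></ul><h4 id="小结:"><a href="#小结:" class="headerlink" title="小结:"></a>Summary:</h4><p>Both methods compute distances quickly, but ADC quantizes only y, so its distances are more accurate and it works better. During distance computation only the codebook and the code indices need to be stored; the original image feature vectors can be discarded entirely, which gives both data compression and fast distance computation.<br> Keep in mind that the algorithm is based on quantization, so quantization error is unavoidable and the computed distances are not exact. In practice it is used to return N candidates quickly, which are then matched more precisely to obtain accurate results.<br> OPQ looks for an orthogonal matrix that rotates the original data before decomposition, so that the reconstruction error of the quantized vectors is minimized.</p><h3 id="Faiss-PQ-的示例"><a href="#Faiss-PQ-的示例" class="headerlink" title="Faiss PQ 的示例"></a>Faiss PQ examples</h3><h4 id="PQ-示例"><a href="#PQ-示例" class="headerlink" title="PQ 示例"></a>PQ example</h4><p>d is the dimensionality of the input vectors; n_bits is the number of bits allocated to each subquantizer, so each vector is stored in m * n_bits bits.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 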
class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">m = <span class="number">16</span> <span class="comment"># number of subquantizers</span></span><br><span class="line">n_bits = <span class="number">8</span> <span class="comment"># bits allocated per subquantizer</span></span><br><span class="line">pq = faiss.IndexPQ(d, m, n_bits) <span class="comment"># Create the index</span></span><br><span class="line">pq.train(x_train) <span class="comment"># Training</span></span><br><span class="line">pq.add(x_base) <span class="comment"># Populate the index</span></span><br><span class="line">D, I = pq.search(x_query, k) <span class="comment"># Perform a search</span></span><br></pre></td></tr></table></figure><h4 id="IndexRefineFlat"><a href="#IndexRefineFlat" class="headerlink" title="IndexRefineFlat"></a>IndexRefineFlat</h4><p>Re-ranks the search results using exact distances.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">q = faiss.IndexPQ(d, M, nbits_per_index) <span class="comment"># nbits_per_index: bits allocated per subquantizer</span></span><br><span class="line">rq = faiss.IndexRefineFlat(q) <span class="comment"># re-rank the search results with exact distances</span></span><br><span class="line">rq.train(xt)</span><br><span class="line">rq.add(xb)</span><br><span class="line">rq.k_factor = <span class="number">5</span> <span class="comment"># fetch k_factor * k candidates from the PQ index, then re-rank them</span></span><br><span class="line">D, I = rq.search(xq, <span class="number">10</span>)</span><br></pre></td></tr></table></figure><h4 id="带倒排的-PQ:IndexIVFPQ"><a href="#带倒排的-PQ:IndexIVFPQ" class="headerlink" title="带倒排的 PQ:IndexIVFPQ"></a>PQ with an inverted file: IndexIVFPQ</h4><figure class="highlight 
python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">coarse_quantizer = faiss.IndexFlatL2(d)</span><br><span class="line">index = faiss.IndexIVFPQ(coarse_quantizer, d,</span><br><span class="line"> ncentroids, M, nbits_per_index) <span class="comment"># ncentroids: number of centroids</span></span><br><span class="line">index.nprobe = <span class="number">50</span> <span class="comment"># speed/accuracy trade-off: the nprobe cells closest to the query are searched</span></span><br></pre></td></tr></table></figure><p>For more ways to use faiss, see:<br><a href="https://github.com/facebookresearch/faiss/blob/master/tests/test_index_accuracy.py#L62">https://github.com/facebookresearch/faiss/blob/master/tests/test_index_accuracy.py#L62</a></p><h4 id="小结:摘自ANN-中乘积量化与多维倒排小结"><a href="#小结:摘自ANN-中乘积量化与多维倒排小结" class="headerlink" title="小结:摘自ANN 中乘积量化与多维倒排小结"></a>Summary, excerpted from <a href="https://www.cnblogs.com/dingz/p/10519420.html">A summary of product quantization and the multi-dimensional inverted index in ANN</a></h4><ol><li>Vector quantization is a form of lossy compression.</li><li>In product quantization, more subspaces are not necessarily better; computational cost must be balanced against quantization accuracy.</li><li>More centroids mean lower quantization distortion but higher computational cost. The number of centroids is a hyperparameter that is often tuned in practice.</li><li>Product quantization assumes that the subspaces are independent. In practice this usually does not hold, which motivates the OPQ optimization.<ul><li>OPQ (Ge T, He K, Ke Q, et al. Optimized Product Quantization[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2013, 36(4):744-755.) optimizes PQ for the case where the subspaces are correlated. The main idea is to apply a rotation matrix R to the codebook and to alternate between updating R and re-clustering, so that the final quantization loss is minimized.</li><li>LOPQ (Kalantidis Y , Avrithis Y . Locally Optimized Product Quantization for Approximate Nearest Neighbor Search[C]// 2014 IEEE Conference on Computer Vision and Pattern Recognition. 
IEEE, 2014.) builds on OPQ by adding a separate rotation matrix for each subspace. The figure below shows the centroid distributions under different quantization methods.</li></ul></li></ol><p><img src="/" class="lazyload" data-src="https://cdn.nlark.com/yuque/0/2019/png/283236/1553515802596-7165462a-004f-4188-aa16-10da6654e13d.png#align=left&display=inline&height=603&margin=%5Bobject%20Object%5D&name=1398176.png&originHeight=1823&originWidth=2255&size=651330&status=done&style=none&width=746" alt="1398176.png"></p><h3 id="以上内容参考摘自:"><a href="#以上内容参考摘自:" class="headerlink" title="以上内容参考摘自:"></a>The content above draws on:</h3><p>For an accessible but thorough treatment, the following articles are recommended:</p><ul><li><a href="https://blog.csdn.net/guanyonglai/article/details/78468673">Image retrieval: product quantization (PQ)</a></li><li><a href="https://blog.csdn.net/CHIERYU/article/details/50321473">Understanding the product quantizer for nearest-neighbor search (part 1)</a></li><li><a href="http://www.fabwrite.com/productquantization">Understanding the product quantization algorithm by example</a></li><li><a href="http://vividfree.github.io/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0/2017/08/05/understanding-product-quantization">Understanding the product quantization algorithm</a></li><li><a href="https://www.cnblogs.com/houkai/p/9316155.html">Faiss tutorial: indexes (1)</a></li><li><a href="https://www.cnblogs.com/dingz/p/10519420.html">A summary of product quantization and the multi-dimensional inverted index in ANN</a></li></ul>]]></content>
<summary type="html"><p>Preface:<br> Study notes on PQ (product quantization) and how it is used in faiss, kept here for future reference.</p>
<h3 id="应用场景:"><a href="#应用场景:" class="headerlink" title="应用场景:"></a>Use case:</h3><p>Fast approximate nearest-neighbor search over high-dimensional feature vectors in massive data sets.</p>
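<h3 id="Product-Quantization-算法"><a href="#Product-Quantization-算法" class="headerlink" title="Product Quantization 算法"></a>The Product Quantization algorithm</h3><p>Product Quantization decomposes the original high-dimensional space into the Cartesian product of a finite number of low-dimensional subspaces and quantizes each subspace separately. This also compresses the memory footprint.</p></summary>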
<category term="Algorithm" scheme="https://luochenxi.github.io/categories/Algorithm/"/>
<category term="Faiss" scheme="https://luochenxi.github.io/tags/Faiss/"/>
<category term="Algorithm" scheme="https://luochenxi.github.io/tags/Algorithm/"/>
</entry>
<entry>
<title>Mounting and Unmounting an External Hard Drive on a Mac</title>
<link href="https://luochenxi.github.io/2019/03/18/yuque/Mac%E4%B8%8A%E6%8C%82%E8%BD%BD%E5%8D%B8%E8%BD%BD%E7%A7%BB%E5%8A%A8%E7%A1%AC%E7%9B%98/"/>
<id>https://luochenxi.github.io/2019/03/18/yuque/Mac%E4%B8%8A%E6%8C%82%E8%BD%BD%E5%8D%B8%E8%BD%BD%E7%A7%BB%E5%8A%A8%E7%A1%AC%E7%9B%98/</id>
<published>2019-03-18T03:45:00.000Z</published>
<updated>2020-08-25T10:08:52.748Z</updated>
<content type="html"><![CDATA[<p>Preface<br>When copying files on a Mac with an external hard drive formatted as NTFS, macOS recognizes the drive by default but mounts it read-only, so it may be unusable when you need to write to it. This note records a workaround.<br>The mount_ntfs command that ships with macOS can mount the drive. Because an NTFS drive is mounted read-only by default, first unmount it, then mount it again with mount_ntfs in read-write mode.</p><a id="more"></a><h3 id="挂载分区"><a href="#挂载分区" class="headerlink" title="挂载分区"></a>Mounting the partition</h3><p>The steps are as follows:</p><ol><li>Make sure the external drive is connected, then look up the node it is mounted on:</li></ol><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> diskutil info /Volumes/YOUR_NTFS_DISK_NAME</span></span><br></pre></td></tr></table></figure><ol start="2"><li>Find the Device Node entry in the output:</li></ol><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Device Node: /dev/disk2s5</span><br></pre></td></tr></table></figure><p>A worked example:<br>My drive is mounted at /Volumes/Untitled by default.<br>1. I look up the mount node information for my drive:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> diskutil info /Volumes/Untitled</span></span><br></pre></td></tr></table></figure><p>and find the Device Node entry:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Device Node: /dev/disk2s5</span><br></pre></td></tr></table></figure><p>Keep this value; it is needed later. 2. 
Then eject the drive without unplugging it:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hdiutil eject /Volumes/YOUR_NTFS_DISK_NAME</span><br></pre></td></tr></table></figure><p>On my machine this looks like:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> hdiutil eject /Volumes/Untitled</span></span><br><span class="line">"disk2" unmounted.</span><br><span class="line">"disk2" ejected.</span><br></pre></td></tr></table></figure><p>3. Create a directory to use as the mount point:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo mkdir /Volumes/MYHD</span><br></pre></td></tr></table></figure><p>4. Mount the NTFS drive on the Mac in read-write mode:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo mount_ntfs -o rw,nobrowse /dev/disk2s5 /Volumes/MYHD/</span><br></pre></td></tr></table></figure><p>5. The drive is now writable. For example, I copied a file to it:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">cp ~/doc/A.txt /Volumes/MYHD/b.txt</span><br></pre></td></tr></table></figure><h3 id="卸载分区"><a href="#卸载分区" class="headerlink" title=" 卸载分区"></a> Unmounting the partition</h3><p>1. First list the NTFS drives that are currently mounted:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> mount | grep ntfs</span></span><br></pre></td></tr></table></figure><p>The output looks like this:<br>/dev/disk2s5 on /Volumes/MYHD (ntfs, local, noowners, nobrowse)<br>2. Then unmount the drive:</p><figure 
class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> umount /dev/disk2s5</span></span><br></pre></td></tr></table></figure><p>References:</p><ul><li><a href="https://www.jianshu.com/p/5237811f0ed3">https://www.jianshu.com/p/5237811f0ed3</a></li><li><a href="https://www.xiaorongmao.com/blog/49">https://www.xiaorongmao.com/blog/49</a></li></ul>]]></content>
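<summary type="html"><p>Preface<br>When copying files on a Mac with an external hard drive formatted as NTFS, macOS recognizes the drive by default but mounts it read-only, so it may be unusable when you need to write to it. This note records a workaround.<br>The mount_ntfs command that ships with macOS can mount the drive. Because an NTFS drive is mounted read-only by default, first unmount it, then mount it again with mount_ntfs in read-write mode.</p></summary>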
<category term="mac" scheme="https://luochenxi.github.io/categories/mac/"/>
<category term="mac" scheme="https://luochenxi.github.io/tags/mac/"/>
</entry>
<entry>
<title>Common Docker Commands</title>
<link href="https://luochenxi.github.io/2019/03/15/yuque/Docker%E5%B8%B8%E7%94%A8%E5%91%BD%E4%BB%A4/"/>
<id>https://luochenxi.github.io/2019/03/15/yuque/Docker%E5%B8%B8%E7%94%A8%E5%91%BD%E4%BB%A4/</id>
<published>2019-03-15T08:45:00.000Z</published>
<updated>2019-03-15T08:45:00.000Z</updated>
<content type="html"><![CDATA[<p>Preface<br>A list of commonly used docker commands, kept here for quick lookup.<br>As a rough analogy (containers are not full virtual machines), <strong>Docker</strong> can be thought of as vmware or virtualbox:<br>• An image is like xxx.iso.<br>• A container is like the system that virtualbox runs from xxx.iso.<br>Registry mirror for users in mainland China: <a href="https://www.daocloud.io/mirror#accelerator-doc">https://www.daocloud.io/mirror#accelerator-doc</a></p><h3 id="登陆镜像仓"><a href="#登陆镜像仓" class="headerlink" title="登陆镜像仓"></a>Logging in to a registry</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> docker login harbor.test-init.com <span class="comment"># replace with your own registry address</span></span></span><br></pre></td></tr></table></figure><h3 id="从镜像仓库拉取镜像"><a href="#从镜像仓库拉取镜像" class="headerlink" title="从镜像仓库拉取镜像"></a>Pulling an image from a registry</h3><p>Command: docker pull registry/image:tag</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#</span><span class="bash"> eg:</span></span><br><span class="line"><span class="meta">$</span><span class="bash"> docker pull registry.docker-cn.com/library/ubuntu:16.04 <span class="comment">## pull from the Docker China mirror</span></span></span><br></pre></td></tr></table></figure><a id="more"></a><h3 id="推送本地镜像到镜像仓库"><a href="#推送本地镜像到镜像仓库" class="headerlink" title="推送本地镜像到镜像仓库"></a>Pushing a local image to a registry</h3><p>Command: docker push registry/image:tag</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#</span><span class="bash"> eg:</span></span><br><span class="line"><span class="meta">$</span><span class="bash"> docker push harbor.test.com/deepnet_python/star_face:v1.40</span></span><br></pre></td></tr></table></figure><h3 id="删除镜像"><a href="#删除镜像" class="headerlink" title="删除镜像"></a>Removing an image</h3><h4 id="通过名字和标签删除"><a href="#通过名字和标签删除" 
class="headerlink" title="通过名字和标签删除"></a>Remove by name and tag</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> docker rmi name:tag</span></span><br></pre></td></tr></table></figure><h4 id="强制删除"><a href="#强制删除" class="headerlink" title="强制删除"></a>Force removal</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> docker rmi -f dd6f76d9cc90 <span class="comment"># dd6f76d9cc90 is the image ID</span></span></span><br></pre></td></tr></table></figure><h3 id="构建镜像"><a href="#构建镜像" class="headerlink" title="构建镜像"></a>Building an image</h3><p>Command: docker build</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#</span><span class="bash">eg:</span></span><br><span class="line"><span class="meta">$</span><span class="bash"> docker build -f Docker/Dockerfile_face_attribute_cpu -t harbor.meitu-int.com/deepnet/star-predict:v1.0-3 .</span></span><br><span class="line">OPTIONS:</span><br><span class="line"> • -f build from the specified Dockerfile; if omitted, the file named Dockerfile in the build context directory is used</span><br></pre></td></tr></table></figure><h3 id="运行镜像"><a href="#运行镜像" class="headerlink" title="运行镜像"></a>Running an image</h3><h4 id="命令:docker-run"><a href="#命令:docker-run" class="headerlink" title="命令:docker run"></a>Command: docker run</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span 
class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#</span><span class="bash">eg:</span></span><br><span class="line"><span class="meta">$</span><span class="bash"> docker run -i -t -p 3000:3000 ubuntu:14.04 /bin/bash</span></span><br><span class="line"><span class="meta">#</span><span class="bash">eg:</span></span><br><span class="line"><span class="meta">$</span><span class="bash"> docker run -it --cpuset-cpus=<span class="string">"1-2"</span> -e MODELS_PLUGIN=<span class="string">"face_detection/20181114.tar;mtcnn/20181127.tar;star_face_recognition/20181225.tar"</span> -p 8896:5000 --entrypoint /usr/bin/python harbor.test-int.com/deepnet/star-predict:v1.0-4 app.py 5000</span></span><br><span class="line">OPTIONS:</span><br><span class="line"> • -i keep STDIN open (interactive mode)</span><br><span class="line"> • -t allocate a pseudo-TTY (terminal)</span><br><span class="line"> • -p 3000:3000 publish container port 3000 on host port 3000 (host:container)</span><br><span class="line"> • -d=true run the container in the background (detached)</span><br><span class="line"> • -v /data:/data mount a host directory into the container (host:container)</span><br><span class="line"> • --cpuset-cpus restrict the container to the listed CPU cores</span><br><span class="line"> • -e set an environment variable</span><br><span class="line"> • --entrypoint: override the image entrypoint; the order is: executable path, then image name and tag, then command arguments (separated by spaces)</span><br><span class="line"> • --rm: automatically remove the container and its file system when it exits</span><br></pre></td></tr></table></figure><h4 id="运行已经退出的容器"><a href="#运行已经退出的容器" class="headerlink" title="运行已经退出的容器"></a>Restarting a container that has exited</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> docker ps -a <span class="comment">## 
查看容器列表</span></span></span><br><span class="line"></span><br><span class="line"><span class="meta">$</span><span class="bash"> docker start 容器名或容器ID</span></span><br><span class="line"></span><br><span class="line"><span class="meta">$</span><span class="bash"> docker attach 容器名或容器ID</span></span><br><span class="line"></span><br><span class="line"><span class="meta">$</span><span class="bash"> docker <span class="built_in">exec</span> -it 容器名或容器ID /bin/bash (进入正在运行中的程序)</span></span><br></pre></td></tr></table></figure><h4 id="创建一个守护态的-Docker-容器,然后使用-docker-attach-命令进入该容器。"><a href="#创建一个守护态的-Docker-容器,然后使用-docker-attach-命令进入该容器。" class="headerlink" title="创建一个守护态的 Docker 容器,然后使用 docker attach 命令进入该容器。"></a>创建一个守护态的 Docker 容器,然后使用 docker attach 命令进入该容器。</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#</span><span class="bash">创建一个守护态的Docker容器,然后使用docker attach命令进入该容器。</span></span><br><span class="line"><span class="meta">$</span><span class="bash"> sudo docker run -itd ubuntu:14.04 /bin/bash</span></span><br><span class="line"></span><br><span class="line"><span class="meta">#</span><span class="bash">然后我们使用docker ps查看到该容器信息,接下来就使用docker attach进入该容器</span></span><br><span class="line"><span class="meta">$</span><span class="bash"> sudo docker attach 44fc0f0582d9</span></span><br><span class="line"></span><br><span class="line"><span class="meta">#</span><span class="bash"> 遇到docker attach卡着 解决方法:</span></span><br><span class="line"><span class="meta">$</span><span class="bash"> sudo docker <span class="built_in">exec</span> -it containerID /bin/bash</span></span><br></pre></td></tr></table></figure><h3 id="查看容器列表"><a 
href="#查看容器列表" class="headerlink" title="查看容器列表"></a>查看容器列表</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> docker ps</span></span><br><span class="line">OPTIONS说明:</span><br><span class="line"> • -a :显示所有的容器,包括未运行的。</span><br><span class="line"> • -f :根据条件过滤显示的内容。</span><br><span class="line"> • --format :指定返回值的模板文件。</span><br><span class="line"> • -l :显示最近创建的容器。</span><br><span class="line"> • -n :列出最近创建的n个容器。</span><br><span class="line"> • --no-trunc :不截断输出。</span><br><span class="line"> • -q :静默模式,只显示容器编号。</span><br><span class="line"> • -s :显示总的文件大小。</span><br></pre></td></tr></table></figure><h3 id="把容器打成镜像"><a href="#把容器打成镜像" class="headerlink" title="把容器打成镜像:"></a>把容器打成镜像:</h3><p>命令:docker commit</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> docker commit 9b5d9f58efe1 luochenxi/docker-dev-init:0.1</span></span><br><span class="line">说明:</span><br><span class="line"> • 9b5d9f58efe1 是容器的id</span><br><span class="line"> • luochenxi (也可以是镜像仓库的地址)是你注册的https://store.docker.com/的名字,如果你没有的话,那需要先注册</span><br><span class="line"> • docker-dev-init 是你为该镜像起的名字</span><br><span class="line"> • :0.1 是镜像的版本号,默认是latest版本</span><br></pre></td></tr></table></figure><h3 id="查看构建历史"><a href="#查看构建历史" class="headerlink" 
title="查看构建历史"></a>查看构建历史</h3><p>命令:docker history</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> docker <span class="built_in">history</span> b16523cedeb6 <span class="comment"># b16523cedeb6 容器ID</span></span></span><br></pre></td></tr></table></figure><h3 id="查看到容器的相关信息"><a href="#查看到容器的相关信息" class="headerlink" title="查看到容器的相关信息"></a>查看到容器的相关信息</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> docker inspect 容器id</span></span><br></pre></td></tr></table></figure><h3 id="查看容器日志"><a href="#查看容器日志" class="headerlink" title="查看容器日志"></a>查看容器日志</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">docker logs 容器id</span><br></pre></td></tr></table></figure><h3 id="镜像打标签"><a href="#镜像打标签" class="headerlink" title="镜像打标签"></a>镜像打标签</h3><p>命令:docker tag</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#</span><span class="bash">eg:</span></span><br><span class="line"><span class="meta">$</span><span class="bash"> docker tag ubuntu:16.04 ubuntu:16.04_newtag</span></span><br></pre></td></tr></table></figure><h3 id="容器重命名"><a href="#容器重命名" class="headerlink" title="容器重命名"></a>容器重命名</h3><p>命令: docker rename old 容器名 new 容器名</p><h3 id="空间不足进行清理"><a href="#空间不足进行清理" class="headerlink" title="空间不足进行清理"></a>空间不足进行清理</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#</span><span class="bash"><span class="comment"># 异常现象</span></span></span><br><span class="line">no space left on device docker</span><br><span class="line"></span><br><span class="line"><span class="meta">#</span><span class="bash"><span class="comment"># 清除-- None的docker images</span></span></span><br><span class="line">docker stop $(docker ps -a | grep "Exited" | awk '{print $1 }')</span><br><span class="line">docker rm $(docker ps -a | grep "Exited" | awk '{print $1 }')</span><br><span class="line">docker rmi $(docker images | grep "none" | awk '{print $3}')</span><br></pre></td></tr></table></figure>]]></content>
<summary type="html"><p>Preface<br>A quick reference of common docker commands.<br><strong>Docker</strong> can loosely be thought of as VMware or VirtualBox (an analogy only: containers share the host kernel instead of booting a full virtual machine).<br>• An image is roughly an xxx.iso file.<br>• A container is roughly the system you get after VirtualBox boots xxx.iso.<br>Registry mirror (accelerator) for mainland China: <a href="https://www.daocloud.io/mirror#accelerator-doc">https://www.daocloud.io/mirror#accelerator-doc</a></p>
<h3 id="登陆镜像仓"><a href="#登陆镜像仓" class="headerlink" title="登陆镜像仓"></a>Log in to an image registry</h3><figure class="highlight shell"><table><tr><td class="code"><pre><span class="line"><span class="meta">$</span><span class="bash"> docker login harbor.test-init.com <span class="comment"># replace with your own registry address</span></span></span><br></pre></td></tr></table></figure>
<h3 id="从镜像仓库拉取镜像"><a href="#从镜像仓库拉取镜像" class="headerlink" title="从镜像仓库拉取镜像"></a>Pull an image from a registry</h3><p>Command: docker pull registry/image:tag</p>
<figure class="highlight shell"><table><tr><td class="code"><pre><span class="line"><span class="meta">#</span><span class="bash"> eg:</span></span><br><span class="line"><span class="meta">$</span><span class="bash"> docker pull registry.docker-cn.com/library/ubuntu:16.04 <span class="comment">## pull from the Docker China mirror</span></span></span><br></pre></td></tr></table></figure></summary>
<category term="docker" scheme="https://luochenxi.github.io/categories/docker/"/>
<category term="docker" scheme="https://luochenxi.github.io/tags/docker/"/>
</entry>
<entry>
<title>JavaSE Notes 01: Data Types</title>
<link href="https://luochenxi.github.io/2019/03/03/yuque/JavaSe%E5%A4%87%E5%BF%98%E5%BD%9501-%E5%85%B3%E4%BA%8E%E6%95%B0%E6%8D%AE%E7%B1%BB%E5%9E%8B/"/>
<id>https://luochenxi.github.io/2019/03/03/yuque/JavaSe%E5%A4%87%E5%BF%98%E5%BD%9501-%E5%85%B3%E4%BA%8E%E6%95%B0%E6%8D%AE%E7%B1%BB%E5%9E%8B/</id>
<published>2019-03-02T16:00:00.000Z</published>
<updated>2020-08-25T10:08:52.880Z</updated>
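<content type="html"><![CDATA[<p><img class="lazyload" data-src="https://cdn.nlark.com/yuque/0/2019/png/283236/1552631004536-5ab1a582-7e9e-469e-94b1-07986b168707.png#align=left&display=inline&height=516&margin=%5Bobject%20Object%5D&name=&originHeight=516&originWidth=800&size=0&status=done&style=none&width=800" src="/"></p>
<h3 id="1-Java-基础数据类型"><a href="#1-Java-基础数据类型" class="headerlink" title="1:Java 基础数据类型"></a>1: Java primitive data types</h3><p>Rule: 1 byte = 8 bits; a bit is the smallest unit of storage in memory and holds either 0 or 1.</p><a id="more"></a>
<h4 id="1-1-整数:包含正整数,负整数和-0。"><a href="#1-1-整数:包含正整数,负整数和-0。" class="headerlink" title="1.1 整数:包含正整数,负整数和 0。"></a>1.1 Integers: positive integers, negative integers, and 0</h4><p>byte (1 byte), range: -128 to 127<br>short (2 bytes)<br>int (4 bytes)<br>long (8 bytes)</p>
<h4 id="1-2-浮点型:精度-单精度和双精度"><a href="#1-2-浮点型:精度-单精度和双精度" class="headerlink" title="1.2 浮点型:精度(单精度和双精度)"></a>1.2 Floating point: single and double precision</h4><p>float (single precision, 4 bytes)<br>double (double precision, 8 bytes)</p>
<h4 id="1-3-字符型"><a href="#1-3-字符型" class="headerlink" title="1.3 字符型"></a>1.3 Character</h4><p>char (2 bytes)</p>
<h4 id="1-4-布尔型"><a href="#1-4-布尔型" class="headerlink" title="1.4 布尔型"></a>1.4 Boolean</h4><p>boolean (commonly described as 1 byte; the language does not define its size precisely)</p>
<h4 id="1-5-小结:"><a href="#1-5-小结:" class="headerlink" title="1.5 小结:"></a>1.5 Recap:</h4><p>A data type decides how much memory the JVM sets aside to store your literal value. The value is ultimately converted to binary and stored in that space.<br>Whatever the language, every operation on a computer ends up as machine-level 0s and 1s; the language simply does the conversion for us.<br>In practice, always declare a variable by a rule: pick a data type that fits the value it will hold.</p>
<h3 id="2-什么是变量:"><a href="#2-什么是变量:" class="headerlink" title="2:什么是变量:"></a>2: What is a variable:</h3><ul><li>A variable is made up of a data type, a name, and a value. For example:</li></ul><p>int age = 26;<br>Format: type name = value;</p><ul><li>Rules for variable names (identifiers):</li></ul><p>A name starts with a letter, $, or _; the remaining characters may be letters, digits, $, or _. It must not be a Java keyword and must not contain special characters (#, @), spaces, or similar.<br>The same rules apply to method and class names.</p><ul><li>{} defines a scope</li></ul><p>A scope cannot declare two or more variables with the same name.</p>
<h3 id="3-java-注释:"><a href="#3-java-注释:" class="headerlink" title="3:java 注释:"></a>3: Java comments:</h3><ul><li>Single-line comment: //</li><li>Multi-line comment: /* */</li><li>Documentation comment: /** */. It follows the javadoc rules, is placed on classes, methods, and fields, and is used for hints and for generating javadoc documentation.</li></ul>
<h3 id="总结:"><a href="#总结:" class="headerlink" title="总结:"></a>Summary:</h3><ol><li>When doing arithmetic, store the result in an int or in a wider type such as long, float, or double.</li><li>Take short a=10;short b=1000;. Intuitively one would write short c=a*b;, but the product can exceed the range of short. For this reason Java evaluates arithmetic on byte, short, and char operands as int, so a*b is an int and needs a cast before it can be assigned to a short.</li><li>Mixing or comparing integer and floating-point types involves automatic conversion and explicit casts.<ul><li>byte<short<int<long<float<double: automatic (widening) conversion</li></ul></li><li>float literals are a special case: a float literal with a decimal point must end with f or F, for example float d=3.5f;float c=100f;</li><li>If a float is assigned an integer literal, f or F is not needed; a decimal literal does need it, because the default type of a decimal literal in Java is double.</li><li>For long literals, append L or l (uppercase L is easier to read). The suffix is required once the value lies outside the int range. For example long c=100L;long cc=3454434534L;</li><li>Among the floating-point types the default is double, and its d suffix may be omitted. For example double c=3.0345;double cc =3445.644645d;</li><li>For a value such as 0.445, the leading 0 may be omitted. For example: float a=0.35f;float b=.89f; (that is, 0.89f)</li><li>A narrowing cast loses precision, so use it with care. For example double c=5.8;int d=(int)c; leaves d equal to 5.</li><li>The numbers we normally write in declarations are decimal, using the digits 0-9; JDK 7 added further ways to write literals.</li><li>In code, integers can be written in decimal, octal, hexadecimal, and, since JDK 7, binary (for example 0b1010). Binary literals apply to integer types only, not to floating-point types.</li><li>Converting decimal to binary by hand: divide by 2 repeatedly, writing 1 when the current value is odd and 0 when it is even, then read the bits from last to first.</li></ol>]]></content>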
<summary type="html"><p><img src="https://cdn.nlark.com/yuque/0/2019/png/283236/1552631004536-5ab1a582-7e9e-469e-94b1-07986b168707.png#align=left&display=inline&height=516&margin=%5Bobject%20Object%5D&name=&originHeight=516&originWidth=800&size=0&status=done&style=none&width=800"></p>
<h3 id="1-Java-基础数据类型"><a href="#1-Java-基础数据类型" class="headerlink" title="1:Java 基础数据类型"></a>1: Java primitive data types</h3><p>Rule: 1 byte = 8 bits; a bit is the smallest unit of storage in memory and holds either 0 or 1.</p></summary>
<category term="JAVA" scheme="https://luochenxi.github.io/categories/JAVA/"/>
<category term="JAVA" scheme="https://luochenxi.github.io/tags/JAVA/"/>
</entry>
<entry>
<title>JavaSE Notes 02: Conditionals and the char Type</title>
<link href="https://luochenxi.github.io/2019/03/03/yuque/JavaSe%E5%A4%87%E5%BF%98%E5%BD%9502-%E6%B7%BB%E5%8A%A0%E5%88%A4%E6%96%AD%E5%92%8C%E5%AD%97%E7%AC%A6char%E7%9A%84%E8%AE%A4%E8%AF%86/"/>
<id>https://luochenxi.github.io/2019/03/03/yuque/JavaSe%E5%A4%87%E5%BF%98%E5%BD%9502-%E6%B7%BB%E5%8A%A0%E5%88%A4%E6%96%AD%E5%92%8C%E5%AD%97%E7%AC%A6char%E7%9A%84%E8%AE%A4%E8%AF%86/</id>
<published>2019-03-02T16:00:00.000Z</published>
<updated>2020-08-25T10:08:52.840Z</updated>
<content type="html"><![CDATA[<h3 id="1-java-分为两种数据类型:基础数据类型和封装数据类型"><a href="#1-java-分为两种数据类型:基础数据类型和封装数据类型" class="headerlink" title="1: java 分为两种数据类型:基础数据类型和封装数据类型"></a>1: Java has two kinds of data types: primitive types and reference types</h3><ul><li>Integer: byte (1 byte), short (2 bytes), int (4 bytes), long (8 bytes)</li><li>Floating point: float (4 bytes), double (8 bytes)</li><li>Character: char (2 bytes)</li><li>Boolean: boolean (commonly described as 1 byte)</li></ul><p><strong>The number of bytes determines how much memory a data type occupies, and therefore the range of values it can hold. When you declare a variable, choose the data type according to the value it will store; put the other way round, the value decides which data type is a sensible choice.</strong></p><a id="more"></a><ul><li>Computing the minimum and maximum values:</li></ul><p>Minimum: -2^(bytes x 8 - 1)<br>Maximum: 2^(bytes x 8 - 1) - 1<br>Example: byte (1 byte): minimum -2^(1x8-1) = -128, maximum 2^(1x8-1) - 1 = 127</p>
<h3 id="2-字符"><a href="#2-字符" class="headerlink" title="2: 字符"></a>2: Characters</h3><h4 id="2-1:-什么是字符集"><a href="#2-1:-什么是字符集" class="headerlink" title="2.1: 什么是字符集"></a>2.1: Characters and character sets</h4><p>Character: in the usual sense a single character, and <strong>a character literal must be enclosed in single quotes</strong> (''). Java stores a char as a 16-bit Unicode (UTF-16) code unit.<br>eg:<br>char a = 'A';<br>A computer cannot directly store films, music, pictures, or characters; it can only store binary. All of them are therefore converted to binary before they are stored.</p><ul><li><strong>In the Unicode table, A-Z, a-z, 0-9, punctuation, space, carriage return, and so on each occupy one char.</strong></li><li><strong>A common Chinese character also occupies one char, that is, two bytes.</strong></li><li><strong>char and int can be converted to each other using the character code (the ASCII table for basic characters): '0' is 48, 'A' is 65, 'a' is 97.</strong></li></ul>
<h4 id="2-2:-char-int-和-string-的关系"><a href="#2-2:-char-int-和-string-的关系" class="headerlink" title="2.2: char,int 和 string 的关系"></a>2.2: How char, int, and String relate</h4><ol><li>How are char and String related?<br>There is no string type among the primitive types.<br>The JDK provides String to cover what a single character cannot do: a string is simply a sequence of single characters.</li></ol><p>String is a reference type; it is a class.<br>char has a few special characters that give strings line feeds, carriage returns, single quotes, and so on.<br>The escape character \ is used to write these special symbols inside a string.</p>
<h4 id="2-3:-类"><a href="#2-3:-类" class="headerlink" title="2.3: 类"></a>2.3: Classes</h4><p>A class contains only three kinds of things: methods, fields, and code blocks.<br>The methods and fields of a class are there for callers to use.<br>Calling a method tells the JVM to look the method up; if it exists, the CPU executes the code in the method body.</p>
<h3 id="3-判断"><a href="#3-判断" class="headerlink" title="3: 判断"></a>3: Conditionals</h3><h4 id="3-1-比较:一定是两个物体之间才有比较关系,一定已知量(具体的值)一个未知量(变量)"><a href="#3-1-比较:一定是两个物体之间才有比较关系,一定已知量(具体的值)一个未知量(变量)" class="headerlink" title="3.1 比较:一定是两个物体之间才有比较关系,一定已知量(具体的值)一个未知量(变量)"></a>3.1 Comparison: a comparison always involves two things, typically a known quantity (a concrete value) and an unknown one (a variable)</h4><p>a. Adding a condition with if. Format:</p><figure class="highlight java"><table><tr><td class="code"><pre><span class="line">if (condition) {</span><br><span class="line">    // runs when condition is true</span><br><span class="line">} else {</span><br><span class="line">    // runs when condition is false</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>b. Multiple branches:</p><figure class="highlight java"><table><tr><td class="code"><pre><span class="line">if (conditionA) {</span><br><span class="line">    // runs when conditionA is true</span><br><span class="line">} else if (conditionB) {</span><br><span class="line">    // runs when conditionA is false and conditionB is true</span><br><span class="line">} else {</span><br><span class="line">    // runs when both conditions are false</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<h4 id="3-2:-逻辑符:-amp-amp-并且(且)-或者-非-!-取非"><a href="#3-2:-逻辑符:-amp-amp-并且(且)-或者-非-!-取非" class="headerlink" title="3.2: 逻辑符:&& 并且(且) 或者 || 非 !(取非)"></a>3.2: Logical operators: && (and), || (or), ! (not)</h4>
<h3 id="4-总结"><a href="#4-总结" class="headerlink" title="4: 总结"></a>4: Summary</h3><ol><li>char and int can be converted to each other, because a single char is ultimately a decimal integer code that is then stored in binary.</li><li>char is 2 bytes and int is 4 bytes, so a single char can always be held by an int; this is an automatic (widening) conversion.</li><li>ASCII codes of characters: '0' is 48, 'A' is 65, 'a' is 97.</li></ol>]]></content>
<summary type="html"><h3 id="1-java-分为两种数据类型:基础数据类型和封装数据类型"><a href="#1-java-分为两种数据类型:基础数据类型和封装数据类型" class="headerlink" title="1: java 分为两种数据类型:基础数据类型和封装数据类型"></a>1: Java has two kinds of data types: primitive types and reference types</h3><ul>
<li>Integer: byte (1 byte), short (2 bytes), int (4 bytes), long (8 bytes)</li>
<li>Floating point: float (4 bytes), double (8 bytes)</li>
<li>Character: char (2 bytes)</li>
<li>Boolean: boolean (commonly described as 1 byte)</li>
</ul>
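<p>The sizes in the list can be checked against the constants on the wrapper classes. The following is only a minimal sketch and not part of the original notes: the class name <code>PrimitiveSizes</code> and the helpers <code>minValue</code> and <code>maxValue</code> are hypothetical names chosen for illustration. It assumes Java 8 or later (for the <code>BYTES</code> constants) and computes the range of a signed integer type of n bytes as -2^(8n-1) to 2^(8n-1)-1.</p>

```java
public class PrimitiveSizes {
    // Hypothetical helper: smallest value of a signed integer type that is
    // `bytes` bytes wide, i.e. -2^(8*bytes - 1). Intended for 1 to 4 bytes.
    public static long minValue(int bytes) {
        return -(1L << (8 * bytes - 1));
    }

    // Hypothetical helper: largest value of a signed integer type that is
    // `bytes` bytes wide, i.e. 2^(8*bytes - 1) - 1. Intended for 1 to 4 bytes.
    public static long maxValue(int bytes) {
        return (1L << (8 * bytes - 1)) - 1;
    }

    public static void main(String[] args) {
        // Sizes in bytes, read from the wrapper classes.
        System.out.println("byte:   " + Byte.BYTES);
        System.out.println("short:  " + Short.BYTES);
        System.out.println("int:    " + Integer.BYTES);
        System.out.println("long:   " + Long.BYTES);
        System.out.println("float:  " + Float.BYTES);
        System.out.println("double: " + Double.BYTES);
        System.out.println("char:   " + Character.BYTES);

        // Ranges derived from the byte count, next to the built-in constants.
        System.out.println("byte range:  " + minValue(Byte.BYTES) + " to " + maxValue(Byte.BYTES)
                + " (constants: " + Byte.MIN_VALUE + " to " + Byte.MAX_VALUE + ")");
        System.out.println("short range: " + minValue(Short.BYTES) + " to " + maxValue(Short.BYTES)
                + " (constants: " + Short.MIN_VALUE + " to " + Short.MAX_VALUE + ")");
        System.out.println("int range:   " + minValue(Integer.BYTES) + " to " + maxValue(Integer.BYTES)
                + " (constants: " + Integer.MIN_VALUE + " to " + Integer.MAX_VALUE + ")");
    }
}
```

<p>boolean is left out on purpose because it has no <code>BYTES</code> constant, and for long it is simpler to read <code>Long.MIN_VALUE</code> and <code>Long.MAX_VALUE</code> directly.</p>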
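<p><strong>The number of bytes determines how much memory a data type occupies, and therefore the range of values it can hold. When you declare a variable, choose the data type according to the value it will store; put the other way round, the value decides which data type is a sensible choice.</strong></p></summary>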
<category term="JAVA" scheme="https://luochenxi.github.io/categories/JAVA/"/>
<category term="JAVA" scheme="https://luochenxi.github.io/tags/JAVA/"/>
</entry>
<entry>
<title>JavaSE Notes 03: Operators and Bitwise Operators</title>
<link href="https://luochenxi.github.io/2019/03/03/yuque/JavaSe%E5%A4%87%E5%BF%98%E5%BD%9503-%E5%85%B3%E4%BA%8E%E6%93%8D%E4%BD%9C%E7%AC%A6,%E8%BF%90%E7%AE%97%E7%AC%A6%E5%92%8C%E4%BD%8D%E8%BF%90%E7%AE%97%E7%AC%A6/"/>
<id>https://luochenxi.github.io/2019/03/03/yuque/JavaSe%E5%A4%87%E5%BF%98%E5%BD%9503-%E5%85%B3%E4%BA%8E%E6%93%8D%E4%BD%9C%E7%AC%A6,%E8%BF%90%E7%AE%97%E7%AC%A6%E5%92%8C%E4%BD%8D%E8%BF%90%E7%AE%97%E7%AC%A6/</id>
<published>2019-03-02T16:00:00.000Z</published>
<updated>2020-08-25T10:08:52.796Z</updated>
<content type="html"><![CDATA[<h2 id="运算符"><a href="#运算符" class="headerlink" title="运算符"></a>Operators</h2><p>A Java operator <strong>is a special symbol</strong> used to express computation, assignment, and comparison on data.</p><ul><li>Arithmetic operators</li><li>Assignment operators</li><li>Comparison operators</li><li>Bitwise operators</li><li>Logical operators</li><li>Ternary operator</li></ul><a id="more"></a>
<h3 id="1-算数运算符"><a href="#1-算数运算符" class="headerlink" title="1. 算数运算符"></a>1. Arithmetic operators</h3><figure class="highlight plain"><table><tr><td class="code"><pre><span class="line">Symbols: add (+), subtract (-), multiply (*), divide (/), remainder (%)</span><br><span class="line">Increment: ++</span><br><span class="line">Decrement: --</span><br><span class="line"></span><br><span class="line">They apply to integer and floating-point types.</span><br><span class="line">The result is a number printed in decimal, whichever of byte, short, int, long, char, float, or double is involved.</span><br><span class="line">The result of any arithmetic operation is at least an int.</span><br></pre></td></tr></table></figure><p>a. Increment: ++</p><ul><li>As separate statements:</li></ul><figure class="highlight java"><table><tr><td class="code"><pre><span class="line">int a = 1;</span><br><span class="line">a++;</span><br><span class="line">++a;</span><br><span class="line">System.out.println(a); // 3: as standalone statements, a++ and ++a are equivalent.</span><br></pre></td></tr></table></figure><ul><li>Inside a larger expression: <em>in arithmetic, in a condition, or in a method argument the two forms behave differently</em></li></ul><figure class="highlight java"><table><tr><td class="code"><pre><span class="line">int a = 5;</span><br><span class="line">int b = a++ + 6;</span><br><span class="line">System.out.println(b); // 11</span><br><span class="line">System.out.println(a); // 6</span><br><span class="line"></span><br><span class="line">int a1 = 5;</span><br><span class="line">int b1 = ++a1 + 6;</span><br><span class="line">System.out.println(b1); // 12</span><br><span class="line">System.out.println(a1); // 6</span><br></pre></td></tr></table></figure><p>How [int b = a++ + 6;] is evaluated: use the value first, then increment.<br>step1: a + 6 = 11<br>step2: a becomes 6<br>So b=11, a=6.</p><p>How [int b1 = ++a1 + 6;] is evaluated: increment first, then use the value.<br>step1: a1 becomes 6<br>step2: a1 + 6 = 12<br>So b1=12, a1=6.</p><p>b. Decrement: --<br>Works the same way as increment ++.</p>
<h3 id="2-赋值运算符"><a href="#2-赋值运算符" class="headerlink" title="2. 赋值运算符"></a>2. Assignment operators</h3><ul><li>The equals sign =: an assignment changes the binary value stored in the variable's memory</li><li>Default values of primitive types (for fields):</li><li>byte short int long are all 0</li><li>float double == 0.0</li><li>boolean == false</li><li>char == \u0000 (the null character, not a space)</li><li>Compound assignment operators:</li><li>+=,-=,*=,/=,%=</li></ul>
<h3 id="3-比较运算符"><a href="#3-比较运算符" class="headerlink" title="3. 比较运算符"></a>3. Comparison operators</h3><figure class="highlight plain"><table><tr><td class="code"><pre><span class="line">< > == >= <= != used in conditions</span><br></pre></td></tr></table></figure><p>a. == is an equality test: for primitives it compares values, for objects it compares references (memory addresses).<br>b. = is assignment.<br><strong>With ==, numbers are compared by value and objects (reference types) by memory address.</strong><br><strong>For primitives, == is true whenever the values are equal. For objects, == is true only when both sides refer to the same object; use equals() (kept consistent with hashCode()) to compare contents.</strong></p>
<h3 id="4-逻辑运算符"><a href="#4-逻辑运算符" class="headerlink" title="4. 逻辑运算符"></a>4. Logical operators</h3><ul><li>Short-circuit: or ||, and &&, not !</li><li>Non-short-circuit: or |, and &, exclusive or ^ (| and & always evaluate both operands, so prefer the short-circuit forms for performance.)</li></ul>
<h3 id="5-三目运算符"><a href="#5-三目运算符" class="headerlink" title="5. 三目运算符"></a>5. Ternary operator</h3><p>Format: (boolean condition) ? valueIfTrue : valueIfFalse;</p>
<h3 id="6-位运算符-7-个-—计算"><a href="#6-位运算符-7-个-—计算" class="headerlink" title="6. 位运算符(7 个) —计算"></a>6. Bitwise operators (7): computation</h3><figure class="highlight plain"><table><tr><td class="code"><pre><span class="line">>> signed right shift: vacated high bits are filled with the sign bit.</span><br><span class="line">>>> unsigned right shift: vacated high bits are filled with 0.</span><br><span class="line"><< left shift: moves all bits left by the given number of positions; vacated low bits are filled with 0.</span><br><span class="line">& bitwise AND: 1 only when both bits are 1, otherwise 0.</span><br><span class="line">| bitwise OR: 1 when at least one bit is 1, otherwise 0.</span><br><span class="line">^ bitwise XOR: 0 when the two bits are the same, 1 when they differ.</span><br><span class="line">~ bitwise NOT: inverts every bit of a single number.</span><br><span class="line">Purpose: fast arithmetic on a number; everything is ultimately computed in binary.</span><br><span class="line">Approach: convert the decimal number to binary, then apply the shift or bit operation.</span><br><span class="line"></span><br><span class="line">true & false ---logical operator (result: false)</span><br><span class="line">5 & 9 ---bitwise operation (0101 & 1001 = 0001, so the result is 1)</span><br></pre></td></tr></table></figure>]]></content>
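<summary type="html"><h2 id="运算符"><a href="#运算符" class="headerlink" title="运算符"></a>Operators</h2><p>A Java operator <strong>is a special symbol</strong> used to express computation, assignment, and comparison on data.</p>
<ul>
<li>Arithmetic operators</li>
<li>Assignment operators</li>
<li>Comparison operators</li>
<li>Bitwise operators</li>
<li>Logical operators</li>
<li>Ternary operator</li>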
</ul></summary>
<category term="JAVA" scheme="https://luochenxi.github.io/categories/JAVA/"/>
<category term="JAVA" scheme="https://luochenxi.github.io/tags/JAVA/"/>
</entry>
<entry>
<title>Perceptual Hash Algorithms</title>
<link href="https://luochenxi.github.io/2018/07/26/yuque/%E6%84%9F%E7%9F%A5%E5%93%88%E5%B8%8C%E7%AE%97%E6%B3%95/"/>
<id>https://luochenxi.github.io/2018/07/26/yuque/%E6%84%9F%E7%9F%A5%E5%93%88%E5%B8%8C%E7%AE%97%E6%B3%95/</id>
<published>2018-07-26T07:42:47.000Z</published>
<updated>2020-08-25T10:08:52.456Z</updated>
<content type="html"><![CDATA[<p>感知哈希算法(Perceptual hash algorithm )是哈希算法的一类,通常用来进行图像相似检索。</p><p>常用的有三种:均值哈希(aHash),感知哈希(pHash),差异化哈希(dHash)</p><ul><li>aHash:平均值哈希。速度比较快,但是常常不太精确。</li><li>pHash:感知哈希。精确度比较高,但是速度方面较差一些。</li><li>dHash:差异值哈希。Amazing!精确度较高,且速度也非常快。</li></ul><h3 id="1-基于低频的均值哈希-aHash"><a href="#1-基于低频的均值哈希-aHash" class="headerlink" title="1. 基于低频的均值哈希(aHash)"></a>1. 基于低频的均值哈希(aHash)</h3><p>一张图片就是一个二维信号,它包含了不同频率的成分。如下图所示,亮度变化小的区域是低频成分,它描述大范围的信息。而亮度变化剧烈的区域(比如物体的边缘)就是高频的成分,它描述具体的细节。或者说高频可以提供图片详细的信息,而低频可以提供一个框架。<br> class="lazyload" data-src="https://cdn.nlark.com/yuque/0/2019/png/283236/1562749582502-2bee275c-a810-4816-8e4f-b74a47495c9c.png#align=left&display=inline&height=215&margin=%5Bobject%20Object%5D&name=&originHeight=215&originWidth=707&size=0&status=done&style=none&width=707" <img src="/"><br>而一张大的,详细的图片有很高的频率,而小图片缺乏图像细节,所以都是低频的。所以我们平时的下采样,也就是缩小图片的过程,实际上是损失高频信息的过程。</p><a id="more"></a><p> class="lazyload" data-src="https://cdn.nlark.com/yuque/0/2019/png/283236/1562749627606-986f7034-bc7b-4661-b92a-aaf161089a4d.png#align=left&display=inline&height=139&margin=%5Bobject%20Object%5D&name=&originHeight=139&originWidth=352&size=0&status=done&style=none&width=352" <img src="/"><br>均值哈希算法主要是利用图片的低频信息,其工作过程如下:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">(1)缩小尺寸:去除高频和细节的最快方法是缩小图片,将图片缩小到8x8的尺寸,总共64个像素。不要保持纵横比,只需将其变成8*8的正方形。这样就可以比较任意大小的图片,摒弃不同尺寸、比例带来的图片差异。</span><br><span class="line"></span><br><span class="line">(2)简化色彩:将8*8的小图片转换成灰度图像。</span><br><span class="line"></span><br><span class="line">(3)计算平均值:计算所有64个像素的灰度平均值。</span><br><span class="line"></span><br><span 
class="line">(4)比较像素的灰度:将每个像素的灰度,与平均值进行比较。大于或等于平均值,记为1;小于平均值,记为0。</span><br><span class="line"></span><br><span class="line">(5)计算hash值:将上一步的比较结果,组合在一起,就构成了一个64位的整数,这就是这张图片的指纹。组合的次序并不重要,只要保证所有图片都采用同样次序就行了。(我设置的是从左到右,从上到下用二进制保存)。</span><br></pre></td></tr></table></figure><p>如果图片放大或缩小,或改变纵横比,结果值也不会改变。增加或减少亮度或对比度,或改变颜色,对 hash 值都不会太大的影响。最大的优点:计算速度快!</p><h5 id="均值哈希-python-版本:"><a href="#均值哈希-python-版本:" class="headerlink" title="均值哈希 python 版本:"></a>均值哈希 python 版本:</h5><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">get_ahash</span>(<span class="params">img</span>):</span></span><br><span class="line"> image = cv2.resize(img,(<span class="number">8</span>,<span class="number">8</span>),interpolation=cv2.INTER_AREA) <span class="comment"># step1: resize img</span></span><br><span class="line"> <span class="string">"""resize img 8x8</span></span><br><span class="line"><span class="string"> interpolation参数用于告诉函数怎么插值计算输出图像的像素值。</span></span><br><span class="line"><span class="string"> INTER_AREA在缩小和放大图像时,是完全不一样的。</span></span><br><span class="line"><span class="string"> 详解参阅:https://zhuanlan.zhihu.com/p/38493205</span></span><br><span class="line"><span class="string"> """</span></span><br><span class="line"> image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) <span 
class="comment"># step2: 将8*8的小图片转换成灰度图像(色彩空间转化 BGR↔Gray的转换)</span></span><br><span class="line"></span><br><span class="line"> average = np.mean(image) <span class="comment"># step3: 计算所有64个像素的灰度平均值。</span></span><br><span class="line"> image[np.where(image<=average)] = <span class="number">0</span></span><br><span class="line"> <span class="string">"""</span></span><br><span class="line"><span class="string"> step4:比较像素的灰度:将每个像素的灰度,与平均值进行比较。大于或等于平均值,记为1;小于平均值,记为0。</span></span><br><span class="line"><span class="string"> """</span></span><br><span class="line"> image[np.where(image>average)] = <span class="number">1</span></span><br><span class="line"> image = image.astype(<span class="string">'uint8'</span>)</span><br><span class="line"> image=image.flatten() <span class="comment"># 生成:图片的指纹,flatten函数返回一个折叠成 一维 的数组。</span></span><br><span class="line"> <span class="keyword">return</span> image</span><br></pre></td></tr></table></figure><h3 id="2-增强版:pHash"><a href="#2-增强版:pHash" class="headerlink" title="2. 增强版:pHash"></a>2. 
增强版:pHash</h3><p>均值哈希虽然简单,但受均值的影响非常大。例如对图像进行伽马校正或直方图均衡就会影响均值,从而影响最终的 hash 值。存在一个更健壮的算法叫 pHash。它将均值的方法发挥到极致。使用离散余弦变换(DCT)来获取图片的低频成分。<br>离散余弦变换(DCT)是种图像压缩算法,它将图像从像素域变换到频率域。然后一般图像都存在很多冗余和相关性的,所以转换到频率域之后,只有很少的一部分频率分量的系数才不为 0,大部分系数都为 0(或者说接近于 0)。下图的右图是对 lena 图进行离散余弦变换(DCT)得到的系数矩阵图。从左上角依次到右下角,频率越来越高,由图可以看到,左上角的值比较大,到右下角的值就很小很小了。换句话说,图像的能量几乎都集中在左上角这个地方的低频系数上面了。<br> class="lazyload" data-src="https://cdn.nlark.com/yuque/0/2019/png/283236/1562751740655-5d3d708e-2d61-448f-b138-28e4f02ce484.png#align=left&display=inline&height=272&margin=%5Bobject%20Object%5D&name=&originHeight=272&originWidth=610&size=0&status=done&style=none&width=610" <img src="/"><br>pHash 的工作过程如下:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">(1)缩小尺寸:pHash以小图片开始,但图片大于8*8,32*32是最好的。这样做的目的是简化了DCT的计算,而不是减小频率。</span><br><span class="line"></span><br><span class="line">(2)简化色彩:将图片转化成灰度图像,进一步简化计算量。</span><br><span class="line"></span><br><span class="line">(3)计算DCT:计算图片的DCT变换,得到32*32的DCT系数矩阵。</span><br><span class="line"></span><br><span class="line">(4)缩小DCT:虽然DCT的结果是32*32大小的矩阵,但我们只要保留左上角的8*8的矩阵,这部分呈现了图片中的最低频率。</span><br><span class="line"></span><br><span class="line">(5)计算平均值:如同均值哈希一样,计算DCT的均值。</span><br><span class="line"></span><br><span class="line">(6)计算hash值:这是最主要的一步,根据8*8的DCT矩阵,设置0或1的64位的hash值,大于等于DCT均值的设为”1”,小于DCT均值的设为“0”。组合在一起,就构成了一个64位的整数,这就是这张图片的指纹。</span><br></pre></td></tr></table></figure><h5 id="pHash-python-版本:"><a href="#pHash-python-版本:" class="headerlink" title="pHash python 版本:"></a>pHash python 版本:</h5><figure class="highlight python"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cv2</span><br><span class="line"><span class="keyword">import</span> numpy <span class="keyword">as</span> np</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">get_phash2</span>(<span class="params">img</span>):</span></span><br><span class="line">    img = cv2.resize(img, (<span class="number">32</span>, <span class="number">32</span>), interpolation=cv2.INTER_AREA)  <span class="comment"># step1: resize the image to 32x32</span></span><br><span class="line">    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  <span class="comment"># step2: convert to grayscale to reduce computation</span></span><br><span class="line">    img_2 = img.astype(np.float32)  <span class="comment"># cv2.dct requires floating-point input</span></span><br><span class="line">    img_dct = cv2.dct(img_2)  <span class="comment"># step3: 2D DCT (one call transforms both dimensions), giving a 32*32 coefficient matrix</span></span><br><span class="line">    image = img_dct[:<span class="number">8</span>, :<span class="number">8</span>]  <span class="comment"># step4: keep only the top-left 8*8 block, the lowest frequencies</span></span><br><span class="line">    average = np.mean(image)  <span class="comment"># step5: mean of the retained DCT coefficients, as in average hashing</span></span><br><span class="line">    <span class="comment"># step6: build the 64-bit hash: coefficients >= the mean become 1, the rest become 0</span></span><br><span class="line">    image = (image >= average).astype(<span class="string">'uint8'</span>)</span><br><span class="line">    image = image.flatten()</span><br><span class="line">    <span class="keyword">return</span> image</span><br></pre></td></tr></table></figure><h3 id="3-差异哈希算法(dHash)"><a href="#3-差异哈希算法(dHash)" class="headerlink" title="3. 差异哈希算法(dHash)"></a>3. Difference hash (dHash)</h3><p>Compared with pHash, dHash is much faster; compared with aHash, it gives better results at almost the same cost. It is based on gradients, that is, on brightness differences between adjacent pixels.<br>dHash works as follows:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">(1) Shrink the image: resize it to 9*8, which gives 72 pixels.</span><br><span class="line"></span><br><span class="line">(2) Convert to grayscale: convert the resized image to 256-level grayscale. (See the average hash steps for the details.)</span><br><span class="line"></span><br><span class="line">(3) Compute the differences: dHash works on adjacent pixels. The 9 pixels in each row yield 8 differences, and the 8 rows yield 64 difference values in total.</span><br><span class="line"></span><br><span class="line">(4) Build the fingerprint: record 1 if the left pixel is brighter than the right one, otherwise 0.</span><br></pre></td></tr></table></figure><p>Note that this kind of fingerprinting is not limited to image search; it also applies to other kinds of media. Beyond that, there are many feature-extraction methods for image search, and many of the algorithms leave room for improvement. For portraits, for example, you can run face detection first and then hash only the face region; for images with a solid-color background, you can filter or crop the background first. Finally, the search results can be further filtered by color, scenery, product, and so on.</p><h5 id="dHash-python-版本:"><a href="#dHash-python-版本:" class="headerlink" title="dHash python 版本:"></a>dHash Python version:</h5><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 
class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cv2</span><br><span class="line"><span class="keyword">import</span> numpy <span class="keyword">as</span> np</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">get_dhash</span>(<span class="params">img</span>):</span></span><br><span class="line">    img = cv2.resize(img, (<span class="number">9</span>, <span class="number">8</span>), interpolation=cv2.INTER_AREA)  <span class="comment"># step1: shrink to 9 wide by 8 high (72 pixels)</span></span><br><span class="line">    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  <span class="comment"># step2: convert to grayscale</span></span><br><span class="line">    <span class="comment"># step3: compare horizontally adjacent pixels: the 9 pixels in each row</span></span><br><span class="line">    <span class="comment"># yield 8 differences, and the 8 rows yield 64 difference values in total.</span></span><br><span class="line">    image = np.zeros([<span class="number">8</span>, <span class="number">8</span>])</span><br><span class="line">    image[np.where(img[:, :<span class="number">-1</span>] > img[:, <span class="number">1</span>:])] = <span class="number">1</span>  <span class="comment"># step4: 1 if the left pixel is brighter than the right one, else 0</span></span><br><span class="line">    image = image.astype(<span class="string">'uint8'</span>)</span><br><span class="line">    image = image.flatten()</span><br><span class="line">    <span class="keyword">return</span> image</span><br></pre></td></tr></table></figure>]]></content>
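<summary type="html"><p>Perceptual hash algorithms are a class of hash algorithms, typically used for image similarity search.</p>
<p>Three are in common use: average hash (aHash), perceptual hash (pHash), and difference hash (dHash).</p>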
<ul>
<li>aHash: average hash. Fast, but often not very accurate.</li>
<li>pHash: perceptual hash. More accurate, but slower.</li>
<li>dHash: difference hash. Amazing! Accurate and also very fast.</li>
</ul>
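<h3 id="1-基于低频的均值哈希-aHash"><a href="#1-基于低频的均值哈希-aHash" class="headerlink" title="1. 基于低频的均值哈希(aHash)"></a>1. Low-frequency-based average hash (aHash)</h3><p>An image is a two-dimensional signal that contains components of different frequencies. As the figure below shows, regions where brightness changes little are low-frequency components, which describe large-scale information. Regions where brightness changes sharply (such as object edges) are high-frequency components, which describe fine detail. Put another way, high frequencies provide the detailed information in an image, while low frequencies provide its overall framework.<br><img src="https://cdn.nlark.com/yuque/0/2019/png/283236/1562749582502-2bee275c-a810-4816-8e4f-b74a47495c9c.png#align=left&display=inline&height=215&margin=%5Bobject%20Object%5D&name=&originHeight=215&originWidth=707&size=0&status=done&style=none&width=707"><br>A large, detailed image contains many high frequencies, while a small image lacks detail and is therefore mostly low-frequency. So downsampling, that is, shrinking an image, is in effect a process of discarding high-frequency information.</p></summary>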
<category term="Algorithm" scheme="https://luochenxi.github.io/categories/Algorithm/"/>
<category term="Algorithm" scheme="https://luochenxi.github.io/tags/Algorithm/"/>
</entry>
<entry>
<title>JavaSE Memo 04 - Wrapper Data Types</title>
<link href="https://luochenxi.github.io/2017/01/02/yuque/JavaSe%E5%A4%87%E5%BF%98%E5%BD%9504-%E5%B0%81%E8%A3%85%E6%95%B0%E6%8D%AE%E7%B1%BB%E5%9E%8B/"/>
<id>https://luochenxi.github.io/2017/01/02/yuque/JavaSe%E5%A4%87%E5%BF%98%E5%BD%9504-%E5%B0%81%E8%A3%85%E6%95%B0%E6%8D%AE%E7%B1%BB%E5%9E%8B/</id>
<published>2017-01-01T16:00:00.000Z</published>
<updated>2020-08-25T10:08:52.580Z</updated>
<content type="html"><![CDATA[<h3 id="引用数据类型-封装数据类型"><a href="#引用数据类型-封装数据类型" class="headerlink" title="引用数据类型(封装数据类型)"></a>Reference data types (wrapper types)</h3><p>Wrapper types are data types designed in an object-oriented way. They exist as classes, and because they are declared final they cannot be extended and have no subclasses.</p><ul><li>Integer types: Byte, Short, Integer, Long</li><li>Floating-point types: Float, Double</li><li>Character type: Character</li><li>Boolean type: Boolean</li></ul><p><strong>Summary</strong>: two names are special, int -> Integer and char -> Character; all the others simply capitalize the first letter.</p><a id="more"></a><h4 id="比较:"><a href="#比较:" class="headerlink" title="比较:"></a>Comparison:</h4><ul><li>Numeric comparison: comparing primitive types with == returns true whenever the values are equal, regardless of the data types involved.</li></ul><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// A char is converted to its ASCII code, so it compares as a number</span></span><br><span class="line"><span class="keyword">int</span> a = <span class="number">5</span>;</span><br><span class="line"><span class="keyword">float</span> b = <span class="number">5.0f</span>;</span><br><span class="line"><span class="keyword">double</span> c = <span class="number">10d</span> / <span class="number">2</span>;</span><br><span class="line">System.out.println(a == b); <span class="comment">// true</span></span><br><span class="line">System.out.println(a == c); <span class="comment">// true</span></span><br><span class="line">System.out.println(b == c); <span class="comment">// true</span></span><br><span class="line">System.out.println(<span class="string">'0'</span> == <span class="number">48</span>); <span class="comment">// true</span></span><br></pre></td></tr></table></figure><ul><li>Comparing a wrapper type with a primitive type using == returns true when the values are equal, because the wrapper is unboxed first.</li><li>To compare two wrapper objects, use equals rather than ==.</li><li>Even when two wrapper objects have the same type, use equals to compare values; == compares memory addresses.</li></ul><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">Integer aa = <span class="number">10</span>;</span><br><span class="line"><span class="keyword">int</span> bb = <span class="number">10</span>;</span><br><span class="line">Float cc = <span class="number">10.0f</span>;</span><br><span class="line">System.out.println(aa == bb); <span class="comment">// true</span></span><br><span class="line">System.out.println(bb == cc); <span class="comment">// true</span></span><br><span class="line">System.out.println(aa.equals(cc)); <span class="comment">// false</span></span><br><span class="line">Float dd = <span class="number">10.0f</span>;</span><br><span class="line">System.out.println(cc == dd); <span class="comment">// false: compares memory addresses</span></span><br><span class="line">System.out.println(cc.equals(dd)); <span class="comment">// true: compares values</span></span><br><span class="line">System.out.println(cc.equals(aa)); <span class="comment">// false: equals is true only for the same type with equal values</span></span><br></pre></td></tr></table></figure><h4 id="基础数据类型跟封装数据类型的区别:"><a href="#基础数据类型跟封装数据类型的区别:" class="headerlink" title="基础数据类型跟封装数据类型的区别:"></a>Differences between primitive types and wrapper types:</h4><ul><li>A primitive type is just a plain value. A local primitive variable is stored on the <strong>stack</strong>, while a primitive field is stored inside its enclosing object on the heap.</li><li>A wrapper type is a class. It has methods and fields for working with and converting the wrapped data, including many methods that convert it to the corresponding primitive type.</li><li>Internally a wrapper type still holds a primitive value, but the value is constant: it is stored in a final field set by the constructor, so a wrapper object is immutable.</li></ul><h4 id="和-equals"><a href="#和-equals" class="headerlink" title="==和 equals"></a>== and equals</h4><p>equals always compares objects (class against class), so it applies to wrapper types rather than primitives.</p><h4 id="封装数据类型比较:不用-而是-equals"><a href="#封装数据类型比较:不用-而是-equals" class="headerlink" title="封装数据类型比较:不用==而是 equals"></a>Comparing wrapper types: use equals, not ==</h4><p>In day-to-day development wrapper types are generally used; primitive types are used only in some calculations and return values.<br>Global variables (fields): variables declared inside a class but outside any method.<br>Default values of primitive types when declared as fields:</p><ul><li>Integer types: byte, short, int, long. The default value is 0</li><li>Floating-point types: float, double. The default value is 0.0</li><li>char: the default value is \u0000</li><li>boolean: the default value is false</li></ul><p>Wrapper types: the default value is null, because they are classes.<br>e.g. Integer a = null; // a null reference: no object has been allocated on the heap.</p><ul><li>First, a wrapper type can be null, so you can check whether a value is missing.</li><li>Second, primitive types cannot be used as generic type arguments.</li></ul><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line">Integer a = <span class="number">10</span>;</span><br><span class="line">Integer b = <span class="number">10</span>;</span><br><span class="line">Float c = <span class="number">10f</span>;</span><br><span class="line">System.out.println(a == b); <span class="comment">// true, but only because autoboxing caches Integer values from -128 to 127</span></span><br><span class="line"><span class="comment">// equals checks the type first, then the value: Integer vs Integer, and 10 == 10</span></span><br><span class="line">System.out.println(a.equals(b)); <span class="comment">// true</span></span><br><span class="line"><span class="comment">// The values match, but a Float is not an Integer</span></span><br><span class="line">System.out.println(a.equals(c)); <span class="comment">// false</span></span><br><span class="line"><span class="comment">/** Source code of Integer.equals:</span></span><br><span class="line"><span class="comment">public boolean equals(Object obj) {</span></span><br><span class="line"><span class="comment">    if (obj instanceof Integer) {</span></span><br><span class="line"><span class="comment">        return value == ((Integer)obj).intValue();</span></span><br><span class="line"><span class="comment">    }</span></span><br><span class="line"><span class="comment">    return false;</span></span><br><span 
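class="line"><span class="comment">}</span></span><br><span class="line"><span class="comment">**/</span></span><br></pre></td></tr></table></figure><h4 id="小结:封装数据类型"><a href="#小结:封装数据类型" class="headerlink" title="小结:封装数据类型"></a>Summary: wrapper types</h4><ul><li>When wrapper objects are compared with equals, the result is false whenever their types differ; if the types match and the values are equal, it is true.</li><li>For two wrappers of the same type, == compares references. It agrees with equals only when both refer to the same cached object, for example Integer values from -128 to 127 created by autoboxing.</li><li>In real projects, always compare wrapper types with equals rather than ==.</li><li>Keep in mind during development: if two numbers look the same but equals still returns false, check whether their data types match. With matching types and equal values the result is always true; otherwise it is false.</li><li>String: <strong>== compares memory addresses; equals compares values.</strong></li></ul><h3 id="拆装箱"><a href="#拆装箱" class="headerlink" title="拆装箱"></a>Boxing and unboxing</h3><p>Wrapper types and primitive types can be converted to each other.<br>Uses: value injection, data type conversion, and so on.</p>]]></content>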
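<summary type="html"><h3 id="引用数据类型-封装数据类型"><a href="#引用数据类型-封装数据类型" class="headerlink" title="引用数据类型(封装数据类型)"></a>Reference data types (wrapper types)</h3><p>Wrapper types are data types designed in an object-oriented way. They exist as classes, and because they are declared final they cannot be extended and have no subclasses.</p>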
<ul>
<li>Integer types: Byte, Short, Integer, Long</li>
<li>Floating-point types: Float, Double</li>
<li>Character type: Character</li>
<li>Boolean type: Boolean</li>
</ul>
<p><strong>Summary</strong>: two names are special, int -&gt; Integer and char -&gt; Character; all the others simply capitalize the first letter.</p></summary>
<category term="JAVA" scheme="https://luochenxi.github.io/categories/JAVA/"/>
<category term="JAVA" scheme="https://luochenxi.github.io/tags/JAVA/"/>
</entry>
<entry>
<title>JavaSE Memo 05 - Flow Control</title>
<link href="https://luochenxi.github.io/2017/01/02/yuque/JavaSe%E5%A4%87%E5%BF%98%E5%BD%9505-%E6%B5%81%E7%A8%8B%E6%8E%A7%E5%88%B6/"/>
<id>https://luochenxi.github.io/2017/01/02/yuque/JavaSe%E5%A4%87%E5%BF%98%E5%BD%9505-%E6%B5%81%E7%A8%8B%E6%8E%A7%E5%88%B6/</id>
<published>2017-01-01T16:00:00.000Z</published>
<updated>2020-08-25T10:08:52.504Z</updated>
<content type="html"><![CDATA[<h3 id="switch"><a href="#switch" class="headerlink" title="switch"></a>switch</h3><p>The switch ... case statement is a selection (branching) statement that filters on conditions.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/***</span></span><br><span class="line"><span class="comment"> * Syntax. The expression may be an int, short, char, enum, or byte; since JDK 1.7, String is also supported.</span></span><br><span class="line"><span class="comment"> * switch (expression) {</span></span><br><span class="line"><span class="comment"> *   case value1: code block; break;</span></span><br><span class="line"><span class="comment"> *   case value2: code block; break;</span></span><br><span class="line"><span class="comment"> *   case value3: code block; break;</span></span><br><span class="line"><span class="comment"> *   case value4: code block; break;</span></span><br><span class="line"><span class="comment"> *   case value5: code block; break;</span></span><br><span class="line"><span class="comment"> *   case value6: code block; break;</span></span><br><span class="line"><span class="comment"> *   default: code block; break;</span></span><br><span class="line"><span class="comment"> * }</span></span><br><span class="line"><span class="comment"> */</span></span><br></pre></td></tr></table></figure>]]></content>
<summary type="html"><h3 id="switch"><a href="#switch" class="headerlink" title="switch"></a>switch</h3><p>The switch ... case statement is a selection (branching) statement that filters on conditions.</p>
</summary>
<category term="JAVA" scheme="https://luochenxi.github.io/categories/JAVA/"/>
<category term="JAVA" scheme="https://luochenxi.github.io/tags/JAVA/"/>
</entry>
</feed>