Paper Reading (3): An empirical study on image bag generators for multi-instance learning (2016) - Artificial Intelligence - Inki's Blog

Sunday, 03 May 2020 16:10. Last updated on Sunday, 03 May 2020 16:10.


Preface

Generating data is an art. The paper is available at:
https://link.springer.com/article/10.1007/s10994-016-5560-1

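Abstract

The key points are as follows:

1. The paper compares nine bag generators: Row¹, SB¹, SBN¹, Blobworld², $k$-meansSeg³, WavSeg⁴, JSEG-bag⁵, LBP⁶, and SIFT⁷.
2. Conclusions:
2.1 Bag generators that use a dense sampling strategy perform better;
2.2 The standard multi-instance assumption is not suitable for image classification tasks (I have doubts about this statement).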

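1 Bag generators

Depending on whether a bag generator can distinguish the semantic components of an image, bag generators are divided into non-segmentation bag generators and segmentation bag generators.
1) Non-segmentation bag generators: Row, SB, SBN;
2) Segmentation bag generators: Blobworld, $k$-meansSeg, WavSeg, JSEG-bag;
3) Neither of the above, i.e., local descriptors: LBP, SIFT.
Put simply, non-segmentation means that the way the image is partitioned does not depend on the image content; local descriptors are used in computer vision to describe features of the appearance or shape of a region.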

1.1 Row

Put simply, each row becomes one instance, so the bag size grows linearly with the number of rows of the resized image.

1.1.1 Detailed steps

1) Take any image; this post uses an image from the Tiger dataset in the COREL data source.
2) Filter the image. Four filters are available: 'mean', 'Gaussian', 'median', and 'bilateral'; Gaussian filtering is the default here.

3) Resize the image; the default size is $8 \times 8$.
4) Compute the mean RGB value of each row. The results differ slightly from those of the MATLAB run, presumably because the two use different resize parameters:

[[ 25.375  37.125  36.875]
 [ 24.125  41.5    37.75 ]
 [ 60.375  67.625  50.625]
 [102.375  89.875  65.25 ]
 [115.875 105.25   84.125]
 [105.125  93.125  75.125]
 [ 82.625  83.     67.875]
 [ 58.5    72.25   65.375]]

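5) Generate the bag, denoted $M_{row} \in \mathbb{R}^{8 \times 9}$, where $9$ is a fixed number of columns. The first three columns of $M_{row}$ are the results computed in 4); the middle three columns are the current row minus the row above; the last three columns are the current row minus the row below. What about the first and last rows, which lack a neighbor? Treat the image as a circular queue, so that the row above the first row is the last row and the row below the last row is the first row. Note that in the output below, the last three entries of the final row equal the last row minus the second row (e.g., 58.5 - 24.125 = 34.375) rather than the last row minus the first row (58.5 - 25.375 = 33.125) that the circular rule prescribes: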

[[ 25.375 37.125 36.875 -33.125 -35.125 -28.5 1.25 -4.375 -0.875]
 [ 24.125 41.5 37.75 -1.25 4.375 0.875 -36.25 -26.125 -12.875]
 [ 60.375 67.625 50.625 36.25 26.125 12.875 -42. -22.25 -14.625]
 [102.375 89.875 65.25 42. 22.25 14.625 -13.5 -15.375 -18.875]
 [115.875 105.25 84.125 13.5 15.375 18.875 10.75 12.125 9. ]
 [105.125 93.125 75.125 -10.75 -12.125 -9. 22.5 10.125 7.25 ]
 [ 82.625 83. 67.875 -22.5 -10.125 -7.25 24.125 10.75 2.5 ]
 [ 58.5 72.25 65.375 -24.125 -10.75 -2.5 34.375 30.75 27.625]]

6) Normalize (min-max over the whole bag):

[[0.42676168 0.50118765 0.49960412 0.05621536 0.04354711 0.08551069 0.27395091 0.23832146 0.26049089]
 [0.41884402 0.52889945 0.50514648 0.2581156 0.29374505 0.27157561 0.03642122 0.10055424 0.18448139]
 [0.64845606 0.69437846 0.58669834 0.49564529 0.43151227 0.34758511 0. 0.12509897 0.17339667]
 [0.91448931 0.83531275 0.67933492 0.53206651 0.40696754 0.35866983 0.18052257 0.16864608 0.14647664]
 [1. 0.93269992 0.79889153 0.35154394 0.36342043 0.38558987 0.3341251 0.34283452 0.32304038]
 [0.93190816 0.85589865 0.7418844 0.19794141 0.18923199 0.20902613 0.40855107 0.33016627 0.31195566]
 [0.78939034 0.79176564 0.695962 0.12351544 0.20190024 0.22011085 0.41884402 0.3341251 0.28186857]
 [0.63657957 0.72367379 0.68012668 0.11322249 0.19794141 0.25019794 0.4837688 0.4608076 0.44101346]]
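Steps 4) to 6) can also be sketched in a few vectorized NumPy lines. The block below is only an illustration: it starts from the row means printed in step 4) rather than from an image, and the helper names `row_bag` and `min_max` are hypothetical, not part of the post's code. It uses `np.roll` to obtain the row above and the row below with wrap-around.

```python
import numpy as np

# Row-wise mean RGB values copied from step 4).
row_mean = np.array([
    [25.375, 37.125, 36.875],
    [24.125, 41.5, 37.75],
    [60.375, 67.625, 50.625],
    [102.375, 89.875, 65.25],
    [115.875, 105.25, 84.125],
    [105.125, 93.125, 75.125],
    [82.625, 83.0, 67.875],
    [58.5, 72.25, 65.375],
])


def row_bag(row_mean):
    """Build the n x 9 Row bag from an n x 3 matrix of row means (hypothetical helper)."""
    above = np.roll(row_mean, 1, axis=0)   # above[i] is row i - 1, wrapping to the last row
    below = np.roll(row_mean, -1, axis=0)  # below[i] is row i + 1, wrapping to the first row
    return np.hstack([row_mean, row_mean - above, row_mean - below])


def min_max(data):
    """Scale all entries into [0, 1] using the global minimum and maximum."""
    return (data - data.min()) / (data.max() - data.min())


bag = row_bag(row_mean)
norm = min_max(bag)
print(bag.shape)  # (8, 9)
print(bag[0])
print(norm[0])
```

With strict wrap-around the final row of `bag` ends in 33.125, 35.125, 28.5; all other entries agree with the matrices printed above.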

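1.1.2 Complete code

Note: all of the code depends on the support code, i.e., it imports the SimpleTool.py file; the image path must also be adjusted accordingly.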

'''
@(#)The bag generators
Author: inki
Email: inki.yinji@qq.com
Created on May 01, 2020
Last Modified on May 03, 2020
'''
import SimpleTool
import numpy as np
import warnings
warnings.filterwarnings('ignore')

__all__ = ['Row']


def introduction(__all__=__all__):
    SimpleTool.introduction(__all__)


def Row(file_path='D:/program/Java/eclipse-workspace/Python/data/image/1.jpg', blur='Gaussian', resize=8):
    """
    :param file_path: The path of the image.
    :param blur: 'mean', 'Gaussian', 'median', 'bilateral'; the default setting is 'Gaussian'.
    :param resize: The size of the image after resizing; the default setting is 8.
    :return: The mapping instances of an image (bag).
    """
    temp_pic = SimpleTool.read_pic(file_path)
    temp_pic = SimpleTool.blur(temp_pic, blur)
    temp_pic = SimpleTool.resize_pic(temp_pic, resize)
    # SimpleTool.show_pic(temp_pic)

    # Calculate the mean color of each row.
    temp_num_row = temp_pic.shape[0]
    temp_row_mean_RGB = np.zeros((temp_num_row, 3))  # The size is row times 3.
    for i in range(temp_num_row):
        # .mean() avoids the uint8 overflow that the built-in sum() can hit.
        temp_row_mean_RGB[i] = temp_pic[i, :, :3].mean(axis=0)

    # Generate the bag.
    # First step: the first row.
    ret_bag = np.zeros((temp_num_row, 9))  # The size is row times 9.
    ret_bag[:, :3] = temp_row_mean_RGB  # Current row.
    ret_bag[0, 3:6] = temp_row_mean_RGB[0] - temp_row_mean_RGB[-1]  # Row above (wraps to the last row).
    ret_bag[0, 6:] = temp_row_mean_RGB[0] - temp_row_mean_RGB[1]  # Row below.

    # Second step: all rows except the first and the last.
    for i in range(1, temp_num_row - 1):
        ret_bag[i, 3:6] = temp_row_mean_RGB[i] - temp_row_mean_RGB[i - 1]
        ret_bag[i, 6:] = temp_row_mean_RGB[i] - temp_row_mean_RGB[i + 1]

    # Third step: the last row.
    ret_bag[-1, 3:6] = temp_row_mean_RGB[-1] - temp_row_mean_RGB[-2]  # Row above.
    # Row below wraps to the first row; the original used index 1 here.
    ret_bag[-1, 6:] = temp_row_mean_RGB[-1] - temp_row_mean_RGB[0]
    return SimpleTool.normalize(ret_bag)


if __name__ == '__main__':
    row_bag = Row()
    print(row_bag)

1.2 SB

Row converts the image row by row, whereas SB (Single Blob with no neighbors) scans the image with a small blob of $4$ pixels ($2 \times 2$) and converts each region it covers into an instance. The original post illustrated this with a figure taken from the paper.

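1.2.1 Detailed steps

1) Filter and resize the image;
2) Make sure the numbers of rows and columns are even (an odd trailing row or column is ignored);
3) Generate one instance per four-pixel blob: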

[[ 20. 30. 37. 23. 33. 36. 20. 30. 36. 23. 38. 38.]
 [ 30. 42. 42. 94. 87. 69. 29. 46. 40. 94. 89. 71.]
 [ 47. 57. 49. 120. 116. 97. 34. 46. 48. 81. 87. 77.]
 [ 42. 55. 52. 76. 87. 79. 49. 69. 68. 60. 82. 79.]
 [ 21. 31. 37. 19. 30. 37. 23. 44. 39. 19. 37. 39.]
 [ 79. 80. 61. 65. 73. 57. 133. 113. 88. 137. 109. 82.]
 [115. 114. 95. 133. 113. 89. 91. 90. 77. 156. 110. 89.]
 [ 79. 86. 76. 101. 95. 83. 57. 78. 73. 69. 87. 79.]
 [ 22. 36. 37. 30. 44. 38. 22. 43. 39. 26. 45. 38.]
 [ 62. 70. 50. 56. 65. 49. 153. 107. 73. 150. 122. 87.]
 [181. 149. 116. 190. 164. 125. 149. 116. 93. 175. 153. 118.]
 [120. 107. 89. 114. 100. 75. 77. 88. 78. 73. 70. 58.]
 [ 37. 49. 39. 31. 44. 34. 31. 50. 38. 29. 45. 35.]
 [ 52. 65. 40. 45. 59. 37. 77. 70. 43. 46. 63. 38.]
 [ 91. 78. 60. 50. 51. 42. 97. 87. 59. 58. 56. 40.]
 [ 71. 73. 47. 58. 61. 42. 45. 51. 42. 38. 53. 46.]]

4) Normalize:

[[0.00584795 0.06432749 0.10526316 0.02339181 0.08187135 0.0994152 0.00584795 0.06432749 0.0994152 0.02339181 0.11111111 0.11111111]
 [0.06432749 0.13450292 0.13450292 0.43859649 0.39766082 0.29239766 0.05847953 0.15789474 0.12280702 0.43859649 0.40935673 0.30409357]
 [0.16374269 0.22222222 0.1754386 0.59064327 0.56725146 0.45614035 0.0877193 0.15789474 0.16959064 0.3625731 0.39766082 0.33918129]
 [0.13450292 0.21052632 0.19298246 0.33333333 0.39766082 0.35087719 0.1754386 0.29239766 0.28654971 0.23976608 0.36842105 0.35087719]
 [0.01169591 0.07017544 0.10526316 0. 0.06432749 0.10526316 0.02339181 0.14619883 0.11695906 0. 0.10526316 0.11695906]
 [0.35087719 0.35672515 0.24561404 0.26900585 0.31578947 0.22222222 0.66666667 0.5497076 0.40350877 0.69005848 0.52631579 0.36842105]
 [0.56140351 0.55555556 0.44444444 0.66666667 0.5497076 0.40935673 0.42105263 0.41520468 0.33918129 0.80116959 0.53216374 0.40935673]
 [0.35087719 0.39181287 0.33333333 0.47953216 0.44444444 0.37426901 0.22222222 0.34502924 0.31578947 0.29239766 0.39766082 0.35087719]
 [0.01754386 0.0994152 0.10526316 0.06432749 0.14619883 0.11111111 0.01754386 0.14035088 0.11695906 0.04093567 0.15204678 0.11111111]
 [0.25146199 0.29824561 0.18128655 0.21637427 0.26900585 0.1754386 0.78362573 0.51461988 0.31578947 0.76608187 0.60233918 0.39766082]
 [0.94736842 0.76023392 0.56725146 1. 0.84795322 0.61988304 0.76023392 0.56725146 0.43274854 0.9122807 0.78362573 0.57894737]
 [0.59064327 0.51461988 0.40935673 0.55555556 0.47368421 0.32748538 0.33918129 0.40350877 0.34502924 0.31578947 0.29824561 0.22807018]
 [0.10526316 0.1754386 0.11695906 0.07017544 0.14619883 0.0877193 0.07017544 0.18128655 0.11111111 0.05847953 0.15204678 0.09356725]
 [0.19298246 0.26900585 0.12280702 0.15204678 0.23391813 0.10526316 0.33918129 0.29824561 0.14035088 0.15789474 0.25730994 0.11111111]
 [0.42105263 0.34502924 0.23976608 0.18128655 0.1871345 0.13450292 0.45614035 0.39766082 0.23391813 0.22807018 0.21637427 0.12280702]
 [0.30409357 0.31578947 0.16374269 0.22807018 0.24561404 0.13450292 0.15204678 0.1871345 0.13450292 0.11111111 0.19883041 0.15789474]]
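The blob extraction in steps 2) and 3) can be sketched without explicit loops by reshaping the image into $2 \times 2$ blocks. This is a minimal illustration on a toy array, not the post's implementation: `sb_instances` is a hypothetical helper, and it emits the instances block row by block row, which may differ from the ordering of the output printed above.

```python
import numpy as np


def sb_instances(pic):
    """Turn an H x W x 3 image into one 12-value instance per 2 x 2 pixel blob."""
    # Ignore an odd trailing row or column so that both sizes are even.
    h = pic.shape[0] // 2 * 2
    w = pic.shape[1] // 2 * 2
    pic = pic[:h, :w, :3].astype(float)
    # Split each axis into (block index, offset inside the block).
    blocks = pic.reshape(h // 2, 2, w // 2, 2, 3)
    # Reorder to (block row, block column, row offset, column offset, channel),
    # then flatten each blob: top-left, top-right, bottom-left, bottom-right.
    return blocks.transpose(0, 2, 1, 3, 4).reshape(-1, 12)


# Toy 4 x 4 RGB "image" whose entries are simply 0, 1, 2, ...
toy = np.arange(4 * 4 * 3).reshape(4, 4, 3)
instances = sb_instances(toy)
print(instances.shape)  # (4, 12)
print(instances[0])
```

Each row of `instances` holds the RGB values of the four pixels of one blob, in the same within-blob order as the 12 columns described in step 3).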

1.2.2 Complete code

'''
@(#)The bag generators
Author: inki
Email: inki.yinji@qq.com
Created on May 01, 2020
Last Modified on May 03, 2020
'''
import SimpleTool
import numpy as np
import warnings
warnings.filterwarnings('ignore')

__all__ = ['SB']


def SB(file_path='D:/program/Java/eclipse-workspace/Python/data/image/1.jpg', blur='Gaussian', resize=8):
    """
    :param file_path: The path of the image.
    :param blur: 'mean', 'Gaussian', 'median', 'bilateral'; the default setting is 'Gaussian'.
    :param resize: The size of the image after resizing; the default setting is 8.
    :return: The mapping instances of an image (bag).
    """
    temp_pic = SimpleTool.read_pic(file_path)
    temp_pic = SimpleTool.blur(temp_pic, blur)
    temp_pic = SimpleTool.resize_pic(temp_pic, resize)

    # Avoid the case where the number of rows or columns is not even.
    temp_num_row = temp_pic.shape[0]
    temp_num_column = temp_pic.shape[1]
    if temp_num_row % 2 == 1:
        temp_num_row -= 1
    if temp_num_column % 2 == 1:
        temp_num_column -= 1

    # Why 12? RGB = 3 values for each of the four pixels of a blob.
    temp_bag = np.zeros((temp_num_row // 2, temp_num_column // 2, 12))
    for i in range(0, temp_num_row, 2):  # i indexes rows.
        for j in range(0, temp_num_column, 2):  # j indexes columns.
            temp_bag[i // 2, j // 2, :3] = temp_pic[i, j]  # 1st pixel
            temp_bag[i // 2, j // 2, 3:6] = temp_pic[i, j + 1]  # 2nd pixel
            temp_bag[i // 2, j // 2, 6:9] = temp_pic[i + 1, j]  # 3rd pixel
            temp_bag[i // 2, j // 2, 9:] = temp_pic[i + 1, j + 1]  # 4th pixel

    # Order the instances column by column, as the original per-channel transpose did;
    # swapping the first two axes also works for non-square images.
    temp_bag = temp_bag.transpose(1, 0, 2).reshape(-1, 12)
    return SimpleTool.normalize(temp_bag)


if __name__ == '__main__':
    bag = SB()
    print(bag)

1.3 SBN

SBN is somewhat more involved; it scans the image with a different blob pattern, which the original post showed in a figure.
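The post leaves the SBN steps and code blank. As SBN (Single Blob with Neighbors) is usually described, each instance combines the mean color of a $2 \times 2$ blob with the color differences to its four neighboring blobs (up, down, left, right), giving 15 values. The sketch below follows that description purely as an assumption: the function name `sbn_instances`, the one-pixel stride, the sign of the differences (center minus neighbor), and the choice to skip positions whose neighbors would fall outside the image are all illustrative, not the author's code.

```python
import numpy as np


def sbn_instances(pic, stride=1):
    """Sketch of SBN: mean RGB of a 2 x 2 blob plus differences to its 4 neighbor blobs."""
    pic = pic[:, :, :3].astype(float)
    h, w, _ = pic.shape

    def blob_mean(r, c):
        # Mean RGB of the 2 x 2 blob whose top-left pixel is (r, c).
        return pic[r:r + 2, c:c + 2].reshape(-1, 3).mean(axis=0)

    instances = []
    # Keep the center blob and its neighbors (2 pixels away) inside the image.
    for r in range(2, h - 3, stride):
        for c in range(2, w - 3, stride):
            center = blob_mean(r, c)
            neighbors = [blob_mean(r - 2, c), blob_mean(r + 2, c),
                         blob_mean(r, c - 2), blob_mean(r, c + 2)]
            diffs = [center - n for n in neighbors]
            instances.append(np.concatenate([center] + diffs))
    return np.array(instances).reshape(-1, 15)


demo = np.zeros((8, 8, 3))
print(sbn_instances(demo).shape)  # (9, 15)
```

Under these assumptions an $8 \times 8$ image yields 9 instances of 15 values each; min-max normalization can then be applied as in the Row and SB generators.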

1.3.1 Detailed steps

1.3.2 Complete code

2 Support code

# coding = utf-8
'''
@(#)SimpleTool.py
The class of test.
Author: inki
Email: inki.yinji@qq.com
Created on March 05, 2020
Last Modified on May 03, 2020
'''
import numpy as np
import warnings
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
warnings.filterwarnings('ignore')

__all__ = ['blur', 'index_select_datas', 'normalize', 'read_pic', 'read_file', 'resize_pic', 'show_pic']


def introduction(__all__=__all__):
    """Print the names of the available functions."""
    print("The function list:")
    for _num_function, temp in enumerate(__all__, start=1):
        print("%d: %s" % (_num_function, temp))


def blur(pic, blur='Gaussian'):
    """Image filtering."""
    import cv2
    if blur == 'mean':
        ret_pic = cv2.blur(pic, (3, 3))
    elif blur == 'median':
        ret_pic = cv2.medianBlur(pic, 3)
    elif blur == 'bilateral':
        ret_pic = cv2.bilateralFilter(pic, 9, 75, 75)
    else:
        if blur != 'Gaussian':
            print("Error: there is no filter named " + blur + "; using the default setting, i.e., Gaussian blur.")
        # The original left ret_pic unset for unknown names; fall back to Gaussian.
        ret_pic = cv2.GaussianBlur(pic, (3, 3), 0.5)
    return ret_pic


def index_select_datas(datas, index):
    """Return the elements of datas at the given indices."""
    temp_data = []
    for i in index:
        temp_data.append(datas[i])
    return temp_data


def read_pic(file_path='D:/program/Java/eclipse-workspace/Python/data/image/1.jpg', is_show=False, is_axis=False):
    """Read an image and optionally display it."""
    return_pic = mpimg.imread(file_path)
    if is_show:
        if not is_axis:
            plt.axis('off')
        plt.imshow(return_pic)
        plt.show()
    return return_pic


def read_file(file_path):
    """Load a file and return its lines."""
    with open(file_path) as fd:
        fd_datas = fd.readlines()
    return fd_datas


def resize_pic(pic, resize=8):
    """Resize the image to resize x resize."""
    # The original called scipy.misc.imresize, which was removed in SciPy 1.3;
    # cv2.resize is used here instead, so results may differ slightly.
    import cv2
    return cv2.resize(pic, (resize, resize))


def normalize(data):
    """Min-max normalization over the whole array."""
    _max = np.max(data)
    _min = np.min(data)
    data = (data - _min) / (_max - _min)
    return data


def show_pic(pic, is_axis=False):
    """Display an image."""
    if not is_axis:
        plt.axis('off')
    plt.imshow(pic)
    plt.show()
    plt.close()


if __name__ == '__main__':
    a = list(range(10))
    b = list(range(10, 20))
    datas = np.array([a, b]).T
    print(index_select_datas(datas, [0, 2]))
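For experimenting with the bag generators when no image library is installed, the resize step can be approximated with plain NumPy block averaging. This is a stand-in under stated assumptions, not what `resize_pic` does: `block_mean_resize` is a hypothetical helper, it only shrinks images, and it crops trailing rows or columns that do not fill a whole block, so its output will not match an interpolating resize exactly.

```python
import numpy as np


def block_mean_resize(pic, size=8):
    """Shrink an H x W x C image to size x size x C by averaging equal blocks."""
    h, w, c = pic.shape
    bh, bw = h // size, w // size  # block height and width
    if bh == 0 or bw == 0:
        raise ValueError("the image is smaller than the target size")
    # Crop so that both dimensions are exact multiples of the block size.
    cropped = pic[:bh * size, :bw * size].astype(float)
    # Axes: (block row, row in block, block column, column in block, channel).
    blocks = cropped.reshape(size, bh, size, bw, c)
    return blocks.mean(axis=(1, 3))


toy = np.arange(16 * 16 * 3).reshape(16, 16, 3)
small = block_mean_resize(toy, 8)
print(small.shape)  # (8, 8, 3)
```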

1. Maron, O., & Ratan, A. L. (2001). Multiple-instance learning for natural scene classification. In Proceedings of 18th international conference on machine learning. Williamstown, MA, pp. 425–432.
2. Carson, C., Belongie, S., Greenspan, H., & Malik, J. (2002). Blobworld: Image segmentation using expectation-maximization and its application to image querying. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(8), 1026–1038.
3. Zhang, Q., Goldman, S. A., Yu, W., & Fritts, J. E. (2002). Content-based image retrieval using multiple-instance learning. In Proceedings of 19th international conference on machine learning. Sydney, Australia, pp. 682–689.
4. Zhang, C. C., Chen, S., & Shyu, M. (2004). Multiple object retrieval for image databases using multiple instance learning and relevance feedback. In Proceedings of IEEE international conference on multimedia and expo. Sydney, Australia, pp. 775–778.
5. Liu, W., Xu, W. D., Li, L. H., & Li, G. L. (2008). Two new bag generators with multi-instance learning for image retrieval. In Proceedings of 3rd IEEE conference on industrial electronics and applications. Singapore, pp. 255–259.
6. Ojala, T., Pietikäinen, M., & Mäenpää, T. (2002). Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(7), 971–987.
7. Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91–110.
