Microsoft researchers have developed a “garment segmentation tool” using the Tiramisu deep learning architecture, which can effectively identify clothing items photographed on a smartphone, even against chaotic backgrounds.
The visual search development is likely to be welcomed by the booming online retail industry – for which inventory management is a significant expense: matching catalogue items with physical items in a warehouse is error-prone and time-consuming.
Updating a catalogue throws up similar challenges in terms of managing large collections of photographs, avoiding duplicates and managing similar items.
The ability to rapidly update stock data has become crucial for the fashion industry in recent years, as it moves away from traditional fashion “seasons” to almost weekly stock updates, or “fast fashion”, driven by algorithmic analysis of social media.
A blog post published last week by Microsoft developers CY Yam, Patty Ryan and Elena Terenzi described the development, and detailed a collaboration earlier this year with an unnamed “successful international online fashion retailer” to trial the technology.
They wrote: “Our aim was to work with this retailer to design an algorithm capable of identifying whether a newly arrived item was in stock using only a mobile phone image of the new item as a reference. If the retailer’s staff could snap a photo of a new arrival and use our solution to search their catalog of studio images for matches, we could eliminate the cost of errors and wasted time.”
They added: “Such a retrieval algorithm could allow our partner and other online retailers of all kinds to better manage their inventory – what’s more, the same solution might also assist in the development of a powerful search and retrieval tool.”
Content-based image retrieval techniques often find performance hindered by the difference between the visual environment of the image and its target match.
As the developers noted: “These differences are introduced due to the busy background of one image compared with the clean background of a studio image, inconsistent folding or creases in the apparel, lighting differences, scale and point of view angle differences… In other words, in applying our content-based image retrieval solution, we had to find a way to make sure the mobile images’ busy backgrounds didn’t confuse our retrieval algorithm.”
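To make the retrieval idea concrete, here is a minimal sketch of content-based matching, assuming each image has already been reduced to a short feature vector. The blog excerpt quoted here does not say which features or similarity measure the team used; cosine similarity, the function names and the toy catalogue values below are all illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector matches nothing
    return dot / (norm_a * norm_b)


def rank_catalog(query_vec, catalog):
    """Return (name, score) pairs, best match first."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in catalog.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical feature vectors for three studio catalogue images.
catalog = {
    "red_dress_studio": [0.9, 0.1, 0.0],
    "blue_jeans_studio": [0.1, 0.8, 0.3],
    "green_coat_studio": [0.0, 0.2, 0.9],
}

# Hypothetical feature vector for a phone photo of a new arrival.
query = [0.8, 0.2, 0.1]

ranking = rank_catalog(query, catalog)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In a setup like this, a busy background would feed unrelated pixels into the query's feature vector and pull it away from its true studio match, which is the problem the developers describe.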
After trying and rejecting GrabCut – an iterative segmentation method that requires the user to specify a bounding box around the region of interest – they settled on Tiramisu, a variant of DenseNet, a recently proposed convolutional architecture that can be trained from scratch on a small data set to achieve high levels of image recognition accuracy (94 percent accuracy on a street scene training set).
They ran into a few hiccups along the way (“we found the model was sensitive to initialization and, generally speaking, it learned very slowly… in a few examples where the human labeling was not perfect the model partially reproduces the same error. It seems like the network has not only ‘learned’ to segment, but it may have also ‘learned’ the human labeling error”). Ultimately, though, the garment segmentation tool proved quite capable of separating a photograph’s foreground from its background, allowing the team to eliminate the query image’s busy background, they concluded.
(They even shared the code in a GitHub repo.)
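The background-removal step itself is simple once a segmentation mask exists. The sketch below is not the team's Tiramisu code; it assumes a model has already produced a binary mask (1 for garment, 0 for background) and shows how that mask could be used to blank out the busy background of a query image. The function name, the white fill colour and the toy pixel values are hypothetical.

```python
def apply_mask(image, mask, fill=(255, 255, 255)):
    """Keep pixels where mask is 1 and replace all others with `fill`.

    `image` is a list of rows of (R, G, B) tuples; `mask` is a list of
    rows of 0/1 values with the same height and width.
    """
    same_shape = len(image) == len(mask) and all(
        len(img_row) == len(mask_row)
        for img_row, mask_row in zip(image, mask)
    )
    if not same_shape:
        raise ValueError("image and mask must have the same height and width")
    return [
        [pixel if keep else fill for pixel, keep in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# Toy 2x3 "photo": the middle column is the garment, the rest is clutter.
image = [
    [(12, 40, 200), (180, 30, 30), (90, 90, 90)],
    [(10, 10, 10), (175, 28, 35), (60, 120, 60)],
]

# Hypothetical mask, as a segmentation model might output it.
mask = [
    [0, 1, 0],
    [0, 1, 0],
]

cleaned = apply_mask(image, mask)
for row in cleaned:
    print(row)
```

Filling the background with white is one plausible choice because it resembles the clean backdrop of a studio catalogue image, but the fill value is an assumption rather than something stated in the blog.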
AI in Retail
Microsoft is not alone in looking at the use of machine learning and AI in retail.
Azadeh Yazdan, Intel’s Director of Business Development (AI Products Group), has written about her vision for an “Autonomous Store”: one that is able to operate smartly on its own, constantly informed by cloud-based intelligence.
The chip giant last year also announced plans to invest more than $100 million over the next five years in the retail industry.
The investment will go toward enabling retailers to unify “every part” of the retail operation, supporting Intel’s wider efforts to integrate Internet of Things and other technologies into retail – from inventory management to checkout – through its “broad solution portfolio and rich ecosystem of solution providers”, the company said.