CASE STUDY
PipeSearch
A Digital Trading Platform for Quality Pipe and Tube
Transforming global piping and tubular trade on AWS
Industry
Oil & Gas, Manufacturing, Steel Piping
Challenge
Develop a machine learning pipeline to improve inventory identification and the overall steel pipe and tube trading experience on the platform, saving customers time and cost
Services & Tech
IoT, AI and Machine Learning
CRA piping analytics & trading platform provider
The Challenge
While creating an inventory search and trade solution, PipeSearch found that there was no broadly accepted format for data presentation or inventory description. Inventory holders use various methods, including storing information in spreadsheets, PDF files, or CSV files. Pipe inventory does not have a unique identifier or SKU; instead, each item is described by a combination of three to ten attributes or descriptors that are non-standard and often synonymous with other descriptors. With hundreds of different suppliers and manufacturers forming the industry, this situation grows increasingly complex.
Furthermore, to build their envisioned platform, PipeSearch needed a higher degree of knowledge and expertise around machine learning (ML) and started looking for a skilled and experienced partner that could understand and realize their vision.
After attending an Onica machine learning workshop, PipeSearch realized that Onica, a Premier Partner in the Amazon Web Services (AWS) Partner Network (APN), had the skill set and experience to help them achieve their goals. Given Onica’s extensive experience in machine learning and expertise with AWS, partnering with Onica was an easy decision.
The Solution
To demonstrate the utility of the machine learning model, PipeSearch and Onica decided to develop a pipeline for delivering the model to production. Onica helped PipeSearch build a machine learning model that would identify and normalize specific attributes and descriptors for different inventory items, streamlining inventory identification from its current unstructured state.
The data flow plan designed by Onica involved dropping a file into an Amazon S3 bucket and having an AWS Lambda function trigger a machine learning process. The extracted data would then be fed into a searchable database with normalized descriptions, giving customers easy access to it.
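As an illustration of the file-drop trigger described above, the following is a minimal sketch of a Lambda handler that reads the bucket and object key from an S3 event notification. It is not PipeSearch's or Onica's actual code: the `start_extraction` function is a hypothetical stand-in for whatever step launches the machine learning process, and the bucket and file names in the usage note are made up.

```python
from urllib.parse import unquote_plus


def start_extraction(bucket: str, key: str) -> dict:
    """Hypothetical stand-in for starting the ML extraction step."""
    return {"bucket": bucket, "key": key, "status": "queued"}


def lambda_handler(event, context):
    """Entry point invoked when a file lands in the S3 bucket."""
    jobs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        jobs.append(start_extraction(bucket, key))
    return {"jobs": jobs}
```

Called with an event for a file named `supplier+a/list%231.csv` in a bucket named `inventory-drop`, the handler would queue one job for the decoded key `supplier a/list#1.csv`.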
Natural language processing was the first proposed solution since the source data was textual information. After spending time deriving annotations from existing data, PipeSearch was able to offer millions of records of training data. One challenge with this process was that natural language processing models are designed for human languages, and training the model with the source data would be like teaching it a new language. Despite the novelty of this approach, Onica’s machine learning experts chose to implement a blank Named Entity Recognition (NER) model, free of NLP biases. They were able to monitor and identify the model’s successes and failures, learning rapidly and remodeling as needed to achieve the desired solution.
The team started by stripping the source data of extraneous information and building a model that would identify appropriate phrases or words from the text with an acceptable level of certainty. Onica sampled ten million examples of raw data and trained the model on a sample of 100,000 examples in a week, using randomization so that the sample represented all the data appropriately.
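To make the idea of deriving annotations from existing data concrete, here is a minimal sketch that turns a record's known attribute values into character-offset entity spans, a common input format for training NER models. The function name, the labels (`OD`, `WEIGHT`, `GRADE`), and the sample description are illustrative assumptions, not PipeSearch's actual schema or method.

```python
def derive_annotations(text, attributes):
    """Locate each known attribute value in the raw description.

    Returns a sorted list of (start, end, label) character spans.
    Values that do not appear in the text are skipped.
    """
    spans = []
    lowered = text.lower()
    for label, value in attributes.items():
        start = lowered.find(value.lower())
        if start != -1:
            spans.append((start, start + len(value), label))
    return sorted(spans)


# Hypothetical inventory line and its already-known attribute values.
description = "5.5 IN 17 LB/FT L80"
known = {"OD": "5.5 IN", "WEIGHT": "17 LB/FT", "GRADE": "L80"}
annotations = derive_annotations(description, known)
```

A real pipeline would also need to handle overlapping or repeated values, which this sketch ignores.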
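The case study does not say how the randomized sample was drawn. One common way to take a uniform random sample from a data set too large to hold in memory is reservoir sampling, sketched below under that assumption; the function name and the `seed` parameter are illustrative.

```python
import random


def reservoir_sample(records, k, seed=None):
    """Draw k records uniformly at random from an iterable in one pass."""
    rng = random.Random(seed)
    sample = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Replace an existing entry with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                sample[j] = record
    return sample
```

With ten million raw records streamed from storage, `reservoir_sample(stream, 100_000)` would return 100,000 of them, each record having the same chance of being chosen.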
Outcomes
The team achieved greater than 80% confidence on average in the results within the first week. The solution also allowed the team to identify data they were unsure of, which they used to retrain the model, further strengthening results.
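The retraining loop described above can be sketched as a simple split of predictions by confidence. This is an illustration only: the 0.8 threshold, the function name, and the record layout are assumptions, and the case study's 80% figure is an average confidence rather than a stated cutoff.

```python
def triage(predictions, threshold=0.8):
    """Split predictions into accepted results and items to review.

    Each prediction is a dict with a "confidence" value between 0 and 1.
    Items below the threshold are candidates for review and retraining.
    """
    accepted = [p for p in predictions if p["confidence"] >= threshold]
    review = [p for p in predictions if p["confidence"] < threshold]
    return accepted, review


# Hypothetical model outputs for two inventory lines.
results = [
    {"text": "L80", "label": "GRADE", "confidence": 0.93},
    {"text": "R3", "label": "RANGE", "confidence": 0.41},
]
accepted, review = triage(results)
```

Here the low-confidence `R3` prediction lands in `review`, where it could be corrected and added back to the training data.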
PipeSearch implemented the final solution with Amazon SageMaker, a fully managed service by AWS that provides everything required to build, train, and deploy ML applications.
By thoroughly understanding the problem and quickly delivering a functional solution, Onica helped PipeSearch extract normalized inventory information from raw data. This solution helped PipeSearch expand its platform capabilities and offer customers a more streamlined way of understanding market options for steel pipe beyond the mills.
WHY US
Why Onica
Onica is one of the largest and fastest-growing Amazon Web Services (AWS) Premier Consulting Partners in the world, helping companies enable, operate, and innovate in the cloud. From migration strategy to operational excellence and immersive transformation, Onica is a full spectrum AWS integrator. Learn more at www.onica.com.