Data Mining - Finding Answers in an Ocean of Data




The edge-processed data above highlights in red the individual printer identification numbers with a camera defect, and the edge-processed data below highlights in red the printers with a critical printhead cooling agent defect. Both of these cases, and more than 30 others, were detected automatically on the edge across a growing population of over 600 printers and more than 9000 print jobs.


Using Edge-Based Federated Machine Learning

I drove the development of an edge-based data analytics system for two primary reasons: 1) the data requiring processing was burdensome, at over 5 GB per print cycle, and 2) major components of the data could not be warehoused or processed outside the customer install site because of privacy and confidentiality agreements.

Because I realized this early in the product development cycle, I was able to put in the hooks to process the data on the edge automatically, so that the performance and health of the printers at customer sites could be monitored digitally. I worked with a team of five other engineers and coordinated with the extended team of subsystem experts to translate their knowledge into algorithms that run onboard. With the system in place and reporting actionable metrics, it is now possible not only to determine the root cause of problems proactively, but also to combine and prioritize service visits in a more operationally efficient, and thus more cost-effective, manner.
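To make the idea concrete, here is a minimal sketch of how an onboard check might reduce raw print data to a small report that can leave the customer site. It is an illustration only, not the production system: the names HealthReport, check_cooling_agent and printers_needing_service, the temperature readings, and the 60 °C limit are all hypothetical, and a simple threshold rule stands in for whatever logic the subsystem experts actually specified.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class HealthReport:
    """Compact summary that leaves the site; the raw print data stays local."""
    printer_id: str
    check: str
    flagged: bool
    value: float


def check_cooling_agent(printer_id: str, temps_c: List[float],
                        limit_c: float = 60.0) -> HealthReport:
    """Hypothetical rule: flag the printer when the mean temperature exceeds a limit."""
    avg = mean(temps_c)
    return HealthReport(printer_id, "printhead_cooling", avg > limit_c, round(avg, 1))


def printers_needing_service(reports: List[HealthReport]) -> List[str]:
    """Fleet side: collect the IDs of flagged printers for service planning."""
    return sorted({r.printer_id for r in reports if r.flagged})


# Example with made-up readings for two made-up printers.
reports = [
    check_cooling_agent("P-001", [55.0, 58.0]),
    check_cooling_agent("P-002", [63.0, 66.0]),
]
flagged_ids = printers_needing_service(reports)
```

In this sketch only the HealthReport objects would be transmitted, which is the property that matters for both the data volume and the confidentiality constraints described above.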

With the system running since product release, over 9000 customer builds have been processed in less than one year, corresponding to over 45 terabytes of data. Below is an example of a detailed, contextual image showing a printhead cooling agent system malfunction. When early action is taken on a finding like this, the customer experiences higher part quality and more profitable production yields.