Frontline workers hate filling in forms. Wizy makes the case for image-centric apps

Phil Wainewright, January 31, 2022
Most mobile apps for frontline workers come from a paper-based legacy. Startup Wizy believes these deskless workers deserve an image-centric alternative.

Wizy co-founders Laurent Gasser & Louis Naugès (screengrab from Zoom call)

The 2.7 billion workers who work on their feet rather than at a desk are seen as a huge global market opportunity for enterprise collaboration tools such as Microsoft Teams, Google Workspace and Workplace from Meta (formerly Facebook). But these products' office-based and text-centric origins make them fundamentally unsuited to the needs of these workers, according to the founders of Wizy. The French startup has developed an image-centric database and no-code toolkit that allows enterprises to build mobile apps that it believes are a better fit for the workflows of these frontline roles. Louis Naugès, Wizy's chief strategy officer, says:

[These workers are] 80% of the population, 20% of the IT investments — a ratio [per employee] of one to 16 in dollar, pounds or euro spend for frontline workers. I think we could spend a little more. [But] most of these people don't use, don't like, and will not use office tools like PCs, keyboards, mouses — it doesn't work [for them].

Wizy's bread-and-butter product is an easy-to-deploy Enterprise Mobility Management (EMM) offering for Android phones that is finding traction in particular in retail and logistics. But its flagship offering is WizyVision, a set of software tools for building image-centric applications to use on those phones, automating processes such as reporting issues or recording work done. Rather than forcing frontline workers to stop what they're doing to type in information, these tools let them capture images, video and audio as they work. At the core is a database designed for handling visual objects, which makes it a unique offering, according to Naugès:

The foundation is a Digital Asset Centre [DAC]. It's a database for image, videos and content. And it is today, to my knowledge, the only professional, independent database, specialized for images on the cloud.

The database, which runs on Google Cloud Platform and uses the Google Vision API, can either be accessed as a service by existing applications directly through an API, or Wizy provides a visual process builder called Frontspace for building image-centric workflows that leverage the DAC. The final component of the platform is ML Studio, a no-code tool for creating image recognition models using Google's machine learning capabilities. Unlike a Digital Asset Management (DAM) platform, which organizations use to manage libraries of marketing and product images, the WizyVision DAC targets a much broader range of enterprise activities, in particular those relating to operations and assets.

Start from the image

The aim is to build a new generation of mobile applications that start from intelligent image processing, rather than the historic metaphor of filling in a paper-based form. Laurent Gasser, Wizy's CEO, explains:

Historically, process has gone from paper to forms, and from web forms to mobile forms. But anyways, there was a form you complete with information. We think it should change.

You should start with the picture, extract information from the picture, and then start your form, or start your process, already pre-completed with 50% of the data in it. That's really the change of paradigm we're pushing with Frontspace.

Many frontline workers already use smartphones to record images as part of their work, for example to report a problem, log an asset, or record a finished job. But they typically do that within applications that aren't purpose-built to handle images as part of an enterprise workflow. Either they send images to colleagues over a consumer app such as WhatsApp, which operates outside of enterprise processes, storage and permissions, or they upload images as attachments to a traditional text-centric application, which lacks the functionality to process much of the valuable information associated with the image. Wizy contends that both approaches waste opportunities, not only to save time and improve accuracy and compliance by automating data collection, but also to build up an enterprise image collection for machine learning and other data analytics. An image holds a lot of information that becomes useful once it is part of a searchable database, as Gasser explains:

What we say [is], the picture should be not only centric to starting the process, to save time to the person, but also should be stored in a database, a digital asset center, which is basically a full database with customizable privacy, access rights, search, etc. Then you can work on this, to leverage your data. A lot of companies, for example, want to run models to do machine learning, image recognition, but they don't have the pictures.

So we collect pictures inside processes that can be reused after, for doing statistics about problems, for doing machine learning, [or] for geolocation — finding assets.

Enterprise use cases

Wizy's customers are typically in industries that have to keep tabs on physical infrastructure or track items moving through a process. These include logistics, transport, retail, hospitality and manufacturing. The sweet spot is where organizations have non-core processes that currently require frontline workers to complete forms with 20-30 fields of information, and which can be carried out much faster with image processing. Gasser explains:

They have a lot of side processes, which are a pain in the neck right now, which are not solved, and which can be solved by images. We add a lot of value on [these] processes.

Some of the earliest enterprise use cases among Wizy's existing customers have involved identifying assets or recording their condition by uploading pictures. Any text in the image, such as product names, serial numbers or delivery tags, is automatically read; objects in the picture are recognized and tagged; and the location and time are recorded. Workers can add comments or descriptions by using the voice-to-text option on their phone. Other fields and tags can be automatically applied based on location or role. ML Studio also allows an organization to create custom image tags by training a model based on its own image library, or it can use the API to connect WizyVision to an existing model.
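
To make the idea concrete, here is a minimal sketch in Python of how a report might arrive partly completed once a picture has been processed. It is an illustration only, not WizyVision's API: every class, field and function name below is hypothetical, the text and tags are assumed to have been extracted already by some image recognition service, and the "starts with SN" rule for spotting a serial number is a made-up placeholder.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ImageCapture:
    """What a phone could hand over with one photo (hypothetical fields)."""
    ocr_text: list[str]      # strings already read from the picture
    object_tags: list[str]   # labels already produced by a recognition model
    latitude: float
    longitude: float
    taken_at: datetime
    voice_note: str = ""     # comment dictated via speech-to-text


@dataclass
class Worker:
    """What the enterprise app is assumed to know about the signed-in user."""
    name: str
    role: str
    site: str


def prefill_report(capture: ImageCapture, worker: Worker) -> dict:
    """Build a report whose fields are filled from the image and user context."""
    # Placeholder rule: treat any text starting with "SN" as a serial number.
    serials = [t for t in capture.ocr_text if t.upper().startswith("SN")]
    return {
        "reported_by": worker.name,
        "role": worker.role,
        "site": worker.site,
        "timestamp": capture.taken_at.isoformat(),
        "location": (capture.latitude, capture.longitude),
        "asset_type": capture.object_tags[0] if capture.object_tags else None,
        "serial_number": serials[0] if serials else None,
        "comment": capture.voice_note or None,
        "severity": None,  # nothing to infer it from, so the worker fills it in
    }


# Example: a photo of a damaged printer taken by a store technician.
capture = ImageCapture(
    ocr_text=["ACME LaserJet", "SN-48213"],
    object_tags=["printer"],
    latitude=48.8566,
    longitude=2.3522,
    taken_at=datetime(2022, 1, 31, 9, 30, tzinfo=timezone.utc),
    voice_note="Paper tray is cracked",
)
report = prefill_report(capture, Worker("A. Martin", "store technician", "Paris-12"))
```

In this sketch only `severity` is left for the worker to enter by hand. Everything else comes from the picture, the device or the user's profile, which is the shift from form-first to image-first that Gasser describes.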

In one example, a parcel carrier has been able to improve its ability to trace items in its distribution network that have gone astray after being separated from their tracking ID — images of the item and any markings or labels are uploaded and read, instantly creating a searchable database of identifying characteristics. A restaurant chain has built a visual inventory of all the printers installed at each of its locations, simplifying asset tracking, support and maintenance. A non-profit organization created an app to record visual proof of delivery for respirators donated to remote hospitals in India. Other examples include recording readings from meters and gauges, tracking the condition of assets over time, recording proof of work carried out, or recording and automating compliance and quality checks.

Wizy is still early in its journey, having completed its first couple dozen projects, but it already has more than 30 engineers working on four continents, with people in Australia, Singapore, the Philippines and Spain, as well as France. Gasser and Naugès have a long history of collaborating with Google, having previously co-founded early Google Apps integrator Revevol.

My take

Mobile technology is bringing connected digital processes to workers and locations that have been neglected by earlier generations of enterprise computing. It's about time these frontline workers had the support of technology to help them in their work, which is often fundamental to the operations of the business. We saw this writ large during the pandemic, when healthcare staff, delivery drivers, field maintenance engineers and essential retail workers kept things running, while office workers sheltered in place at home. Now that the economy is starting to return to normal, there's an increased appreciation of their role, in particular as curators of the in-person customer experience.

But all too often, the digital processes they're offered are still tainted by the paper-based, office-centric operating models of those earlier generations. Enterprise IT has a built-in historical assumption that the data to be captured consists of numbers and text, and that the person entering that data will use a keyboard and mouse to navigate the screen as they do so. As Wizy's founders recognize, these assumptions do a disservice both to the workers themselves and to the extensive capabilities of the available technology. A modern smartphone biometrically identifies its user, can report its geolocation to within 10 meters (33 feet), captures high-resolution images, and natively supports speech-to-text. When connected into an enterprise app that knows the user's role, work locations and other parameters, most information can be autofilled and much of the rest will be implicit in the images the user uploads. App designers need to forget the old form analogy and reorient the information gathering around the new paradigm of a handheld, multi-sensor, connected device.

This is a potent example of how radically we need to rethink how things are done when implementing connected digital technologies. When I wrote about the principles of Frictionless Enterprise recently, I cited moving on from paper as a key principle, as well as being prepared to unbundle old processes and repackage them in new configurations. Another principle I cited is breaking out of silos. I think we need to be prepared to recognize that, for many roles, the very notion of the office is itself a constricting silo. It exists because that's where the paper used to be passed around, and in the early days of computing, it's where the network was. Those limitations no longer exist and today's technology allows work to be done wherever it is best done. Perhaps in a few decades' time, the office itself will be obsolete. It already is for frontline workers and others whose main work is done on their feet.

In my review of 2021, I made the point that "the data-driven enterprise isn't just about text data — visual and audio data will also be important." At the time, I was thinking ahead to this article, which I think illustrates the point. In the meantime, I've also written about the growing importance of visual media in digital engagement and an example of AI-powered analysis of video streams for industrial security and safety. Taken together, these stories underline the pace at which visual data is rising in use, and enterprises need to take it seriously in their IT strategies.
