Chatting up artificial intelligence

Madison-based DataChat is working to make data science technologies more accessible and understandable for users.
Democratizing data science for everyone means presenting data in ways that make it easy for users to understand.

Madison startup DataChat is speaking consumers’ language when it comes to data science innovations. Its latest development, an AI tool called Ask, will attempt to transform the way companies solve data science problems and help them glean new insights through user-friendly data analysis.

Co-founders Jignesh Patel, the company’s CEO, and Rogers Jeffrey Leo John found inspiration for DataChat in 2017 when they noticed persistent problems in how data science was practiced. (Patel will join the Computer Science Department at Carnegie Mellon University this fall; DataChat will remain headquartered in Madison.) According to Patel, many programmers were using online resources to translate their intent from natural language into code, one block at a time, when solving problems. They added these blocks to what he calls a “notebook” and then made adjustments so the code would suit their context.

“All of this seemed tremendously wasteful,” Patel says. “We thought, ‘what if we could directly generate the code in the notebook when the programmer [typed] in natural language?’”

Even as DataChat launched and began developing its initial working prototype, Ava, the company was forced to wait for critical large language model (LLM) technology, the underlying power behind ChatGPT, to become available. Luckily, it did.

“Everyone told us that this was a crazy idea,” Patel says, “and they were right. It was. But we kept working on it and were lucky to find investors from the Bay Area who were willing to support our crazy idea.”

The company also secured investment from the Wisconsin Alumni Research Foundation, and by 2020 had raised an early-stage investment round led by two Silicon Valley venture capital firms. Solid funding has helped the company gain a foothold, but competing in the AI data race is no easy task. Patel credits the team’s resourcefulness and its capacity for developing “bleeding-edge techniques” for much of DataChat’s success. He adds, “We are competing with giants that are spending hundreds of millions or billions of dollars to reach similar goals.”

So, what distinguishes DataChat as an emergent force among the competition? One factor is that users communicate with the system in a specific subset of English called Guided English Language (GEL). Patel explains, “It has its roots in industries like aviation, where you want a subset of English that has no ambiguity but is capable of expressing all that you need for a specific application.”

LLMs automatically convert user questions into code and, after running it, present answers. “But how does a user trust the answer?” Patel asks. It’s a question reflective of concerns over AI’s transparency and reliability. DataChat’s answer is a “recipe”: a record of the steps that went into producing an answer, presented in GEL and therefore understandable to users. As Patel puts it, “transparency builds trust.”

With Ask, user feedback also helps train the system to perform better over time. Users can give answers a “thumbs up” or “thumbs down” and manually edit steps of the recipe.

DataChat’s focus on accessibility and transparency in AI technologies underscores current needs in Madison and well beyond. “To truly democratize data, you need to make the data science toolbox accessible and usable by everyone,” Patel says. “That is exactly what we do. Companies can start to upskill their existing users … while simultaneously relieving some of the overhead for any of their more advanced users.”