Indika AI bets on synthetic data to overcome the problems of real-world data


What do self-driving vehicles, fraud detection tools and language systems have in common? These seemingly unrelated applications share one common thread: synthetic data. Synthetic data helps autonomous vehicles respond better to the challenges of real-world environments. It is also aiding fraud detection in the fintech industry and the training of language systems.

Quality data is required to train Artificial Intelligence (AI) solutions so that the output of the AI model is exactly as desired. But real-world data comes with issues of privacy, security, cost and accessibility, along with limits of scale and size.

That’s where synthetic data comes in, with the promise of greater scale, privacy, security and accuracy. And this is exactly what Mumbai-based startup Indika AI is betting on. It is building a platform for generating synthetic data in areas such as legal, medical and finance AI.

“Synthetic data preserves all the information of real-world data, but not the identity. It helps to overcome privacy and regulatory issues relating to data sharing,” says Hardik Dave, Founder and Chief Executive Officer of Indika AI.

Established in 2021 by Hardik Dave and Dr Anshul Pandey, Indika AI is a data solutions company that works with AI companies across the world to help them prepare training data for building effective AI models.

Annotating the data

An AI model needs to be taught to understand specific details before it can make decisions and take action. For this, the data needs to be labelled or annotated so that an AI solution can recognise specific areas of interest, distinguish objects, and discover hidden patterns, intentions, contexts and sentiments within the data.

For example, Indika AI has worked on a project that labels financial news to build an AI-based price forecasting tool. The tool recommends stocks an investor should add to their portfolio based on stock price information.

Building training datasets for AI companies operating in niche and restricted sectors, such as legal, financial and medical AI, is a challenge because of the need for strong domain knowledge, as well as subjectivity, data security and regulatory concerns, says Hardik.

This is why synthetic data generation could be a game changer in the near future: it could provide datasets that are more comprehensive, accurate, reliable and free of bias, he claims. Of course, the quality of synthetically produced data will depend on the accuracy of the AI model that creates it.

What synthetic data brings to the table

Synthetic data is algorithmically generated (artificially produced) data that approximates the characteristics of real-world data, including text, tabular data, video, images and speech. The process involves feeding data into an AI model to create synthetic data that can serve as an effective alternative or supplement to real-world data.

At present, the majority of AI models are trained on real-world data. Only a tiny percentage of models use synthetic datasets. But this could change in the coming years, according to Hardik.

Synthetic data will not only fill the gaps in AI training data in cases where real-world data is unavailable, or unusable for reasons of privacy, security or cost. It will also generate larger datasets for testing and training AI models, he claims.

At present, Amazon uses synthetic data to train Alexa’s language system, Google’s Waymo uses synthetic data to train its self-driving vehicles, and American Express and J. P. Morgan use synthetic financial data to aid fraud detection.

By 2024, 60% of the data used in the development of AI and analytics projects will be synthetically generated, according to research and consultancy firm Gartner.

What Indika AI is doing

Indika AI is currently developing a platform for generating synthetic data across fields such as medical, finance and legal AI. It expects to launch the platform in three months.

The company’s synthetic data platform will support tabular data at first, and will also accommodate other types of data, such as unstructured text and images, as the platform scales.

Hardik gives an example of a possible use case in financial services. Suppose an existing dataset doesn’t contain credit card spending information for women in their 30s with a monthly income of Rs 10 lakh who live in Tier 3 cities. A synthetic dataset could be constructed by analysing the spending patterns of other users, based on factors such as gender, age, income range and where they live.
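
The idea in this example can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy records, the factor names, the additive estimate and the noise scale are hypothetical, and this is not a description of how Indika AI's platform works. It estimates the spending of a segment that is missing from the data by combining how each factor shifts spending among other users, then samples synthetic rows around that estimate.

```python
import random
import statistics

# Hypothetical toy records:
# (gender, age_band, income_band, city_tier, monthly_card_spend_rs)
REAL_RECORDS = [
    ("F", "20s", "mid", "tier1", 42000),
    ("F", "30s", "mid", "tier1", 51000),
    ("M", "30s", "high", "tier1", 98000),
    ("M", "30s", "high", "tier3", 71000),
    ("F", "40s", "high", "tier2", 88000),
    ("M", "20s", "mid", "tier3", 30000),
    ("F", "30s", "mid", "tier3", 36000),
    ("M", "40s", "high", "tier2", 93000),
]


def estimate_mean_spend(records, target):
    """Additive estimate: overall mean plus each factor level's deviation from it."""
    overall = statistics.mean([r[-1] for r in records])
    estimate = overall
    for i, level in enumerate(target):
        matching = [r[-1] for r in records if r[i] == level]
        if matching:  # ignore factor levels that never occur in the real data
            estimate += statistics.mean(matching) - overall
    return estimate


def synthesize(records, target, n, seed=0):
    """Sample n synthetic rows for the target segment around the estimate."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    centre = estimate_mean_spend(records, target)
    # Assumed noise scale: a quarter of the overall spread in the real data.
    spread = statistics.stdev([r[-1] for r in records]) * 0.25
    return [target + (max(0, round(rng.gauss(centre, spread))),) for _ in range(n)]


# A segment with no rows in REAL_RECORDS: women in their 30s,
# high income band, Tier 3 city.
target_segment = ("F", "30s", "high", "tier3")
synthetic_rows = synthesize(REAL_RECORDS, target_segment, n=5)
for row in synthetic_rows:
    print(row)
```

A production system would typically use a learned generative model rather than this additive shortcut, and, as the article notes, the quality of the output depends on the accuracy of the model that produces it.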

Additional solutions

Datasets are usually labelled manually, which takes a lot of time and effort. So, the startup is working on a platform that allows automatic or programmatic labelling of datasets. This will make annotation quicker and more efficient. The startup hopes to launch the platform in the next six months.

In the next few years, Indika AI also wants to work on a platform that will simplify the procedures and workflows involved in data collection and annotation, so that more domain experts can contribute to AI research, development and training.

The company is also developing labelled, ready-to-use datasets for upcoming use cases.
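
As a rough illustration of what programmatic labelling can look like, the Python sketch below tags financial news headlines with a sentiment label by letting a few simple rules vote. The rules, keyword lists and label names are hypothetical and chosen only for this example; they do not describe Indika AI's platform.

```python
import re
from collections import Counter

POSITIVE, NEGATIVE, ABSTAIN = "positive", "negative", None

# Hypothetical keyword lists for the rule-based labelling functions.
POSITIVE_TERMS = {"surges", "beats", "record", "upgrade"}
NEGATIVE_TERMS = {"plunge", "fraud", "downgrade", "loss"}


def lf_positive_terms(text):
    words = set(re.findall(r"[a-z]+", text.lower()))
    return POSITIVE if words & POSITIVE_TERMS else ABSTAIN


def lf_negative_terms(text):
    words = set(re.findall(r"[a-z]+", text.lower()))
    return NEGATIVE if words & NEGATIVE_TERMS else ABSTAIN


def lf_percent_move(text):
    # Matches phrases such as "up 39%" or "down 4.5%".
    match = re.search(r"\b(up|down)\s+\d+(?:\.\d+)?%", text.lower())
    if match is None:
        return ABSTAIN
    return POSITIVE if match.group(1) == "up" else NEGATIVE


LABELLING_FUNCTIONS = [lf_positive_terms, lf_negative_terms, lf_percent_move]


def label(text):
    """Majority vote over the rules; abstain when no rule fires or votes tie."""
    votes = [lf(text) for lf in LABELLING_FUNCTIONS]
    votes = [v for v in votes if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting rules: leave the item for a human annotator
    return ranked[0][0]


headlines = [
    "Profits up 333% year-over-year",
    "Shares plunge after fraud probe",
    "Board meets on Tuesday",
]
labels = [label(h) for h in headlines]
```

Headlines on which the rules abstain or disagree can be routed to human annotators, so manual effort is spent only where the rules are unsure.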

Growth and market size

Indika AI works with AI firms in India, North America and Europe to design and develop custom strategies. It competes with companies such as iMerit Technology, Scale Labs and Appen.

The company earned a total of Rs 60 lakh in the first ten months following its inception in May last year. The startup anticipates stronger growth in the coming fiscal year, and plans to reach around Rs 5 crore in revenue.

Last year, Indika AI raised an undisclosed amount in a pre-seed round led by Dr Anshul. The company plans to raise seed funding in the near future.

The global data collection and labelling market was estimated at $1.67 billion in 2021. The market is predicted to grow by 25% from 2022 to 2030, according to business consulting firm Grand View Research.

The data annotation industry in India is in its infancy. According to a NASSCOM report, the data annotation market served by India will be worth more than $7 billion by 2030.

Team Indika AI

The Indika AI team has more than 100 employees, including domain experts, data scientists, solution architects and certified annotators.

Hardik, who heads the Legal AI team, has experience in tax, corporate and IP law. He previously worked for consultancies Ernst & Young and Baker Tilly International. Dr Anshul, co-founder of Indika AI, is also co-founder of Accern, a no-code fintech AI company based in the US.
