Today is May 27, 2024. You're reading '7/1', the Monday edition of Startupt
Welcome to this week's Monday Edition! I'm Sergei Yakupov, editor of Startupt.co, and today we're highlighting the booming data analytics market and an innovative startup in that space, Jsonify.
The data analytics market is projected to grow from USD 64.3 billion in 2023 to USD 226.2 billion by 2028, at a CAGR of 28.6%. Key trends for 2024 include democratization of data, AI-powered insights, and embedded analytics, making data more accessible and driving better decision-making.
This edition features Jsonify, a startup transforming web data into structured formats using AI. With 20 paying customers and a strong growth trajectory, Jsonify is poised to revolutionize data management. Don't miss our exclusive Q&A with founder Paul Hunkin, who shares insights into their technology and future plans. And by the way, have a look at what Jsonify found on our website.
We also include a playlist of mathematically precise songs to inspire you, from Tool's "Lateralus" to Foals' "Mathletics."
Stay tuned and see you next week. Don't forget to check your email on Thursday for our regular newsletter with the latest startup news from Portugal.
Once you've read the newsletter to the very end, come back here and follow this link to watch Paul's demo of Jsonify in New York this May.
Forward this email to your friends or colleagues, or share the link to the web version on social media.
The data analytics market is forecast to expand from USD 64.3 billion in 2023 to USD 226.2 billion by 2028, a compound annual growth rate (CAGR) of 28.6%. This growth is driven by the increasing need to make data-driven decisions, uncover hidden patterns, identify opportunities, and mitigate risks.
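As a quick sanity check on the forecast, here is a minimal sketch of the standard CAGR formula applied to the two figures quoted above. It assumes the growth compounds over five yearly periods (2023 to 2028); the function name `cagr` is ours, chosen for illustration.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.25 means 25% per year)."""
    return (end_value / start_value) ** (1 / years) - 1


# Market size figures from the forecast, in USD billions.
# Assumption: five compounding periods between 2023 and 2028.
rate = cagr(64.3, 226.2, 2028 - 2023)
print(f"Implied CAGR: {rate:.1%}")
```

Under that assumption the implied rate rounds to the 28.6% quoted in the forecast.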
In 2024, three pivotal trends will redefine data analytics and business intelligence (BI): democratization of data, AI-powered insights through augmented analytics, and the shift towards embedded analytics. These trends aim to make data universally accessible, facilitate deeper insights, and enhance decision-making across organizations.
Offering: The solutions segment is projected to dominate the market during the forecast period. Advanced analytics solutions leverage statistical, mathematical, and machine learning techniques to uncover patterns and trends. Organizations prefer deploying these solutions either on-premises or via cloud platforms based on their operational requirements, security concerns, and budget considerations.
Business Function: The operations and supply chain segment is expected to witness the highest CAGR. Data analytics in this segment helps organizations balance operational costs, speed, flexibility, and quality. Companies are increasingly adopting technologies such as IoT, ERP systems, cloud computing, and social media for operational management.
Regional Insights: North America is anticipated to lead the market, driven by the adoption of advanced technologies, robust infrastructure, and an ecosystem fostering innovation in AI. The presence of leading companies like IBM, Oracle, Microsoft, and SAS Institute further strengthens this region's market position.
The data analytics market is on a steep growth trajectory, underpinned by technological advancements and a strong shift towards data-driven strategies. As companies continue to harness the power of big data, the emphasis on advanced analytics solutions will only intensify, creating a fertile ground for innovation and growth.
Name: Jsonify
Location: Lisbon
In today's digital age, with over a billion websites and an explosion of unstructured data, businesses face the daunting task of managing and utilizing this vast sea of information. Enter Jsonify, a groundbreaking AI browser agent that promises to make the internet your database. By leveraging cutting-edge computer vision and large language models (LLMs), Jsonify automatically transforms web data, documents, and search queries into structured, up-to-date information.
Data is growing at an unprecedented rate, with unstructured data representing 80-90% of all new enterprise data and growing three times faster than structured data. Traditional methods of data scraping are not only time-consuming but also brittle and inefficient, leaving businesses struggling to keep their data current and usable.
Jsonify addresses these challenges with its AI-powered agents that browse, scrape, and transform data from multiple sources into structured formats like CSVs, APIs, and Sheets. The platform supports seamless integrations with popular tools like Airtable, Bubble, and HubSpot, enabling businesses to build complex data pipelines with ease.
AI Browser Agents: Automate data extraction and conversion from websites and documents into structured data.
No-Code Dashboard: Create and manage data pipelines without needing extensive technical knowledge.
Developer API: Easily integrate Jsonify into existing systems for customized data solutions.
Scalable and Self-Updating: Keeps data fresh and reduces the need for manual updates.
Since its beta launch, Jsonify has gained traction with 20 paying customers across various industries, generating a Monthly Recurring Revenue (MRR) of approximately $4,000. With a Total Addressable Market (TAM) of $4 billion and ambitions to capture a significant share, Jsonify is poised for rapid growth.
Jsonify boasts a world-class team with deep expertise in big data, AI, and product design. The leadership includes veterans from companies like NASA, Google, and several successful startups, ensuring a robust foundation for innovation and growth.
Jsonify has secured $500,000 from notable investors like Betaworks, Differential, and Mozilla Ventures and is currently raising an additional $2 million to extend its runway and accelerate product development and market expansion.
Q: What inspired the creation of Jsonify, and what key problem did you aim to solve with your AI agents and no-code platform?
A: I was doing a bit of fun weekend coding -- it wasn't originally meant to be a serious startup, just a fun little hack. At the time I was frustrated with monitoring tools like VisualPing that alert on website changes when there's only a minor pixel difference, and I had an idea to use some fun computer vision + data synthesis to understand the page, extract the meaningful information, and only alert when that changes. I started speccing out a little proof of concept, and halfway through realized I'd also invented a way to make a nice little AI data scraper. That sounded pretty fun on its own, so I built that as well -- you put in a URL and it'd give you a textbox of JSON. The very first end-to-end working run!
I posted it in a few places, went out to lunch, and came back to find a ton of people trying it. Clearly I'd stumbled onto something people wanted. I then spent the next couple of months refining the vision, talking to users, and validating the idea with some $1 payments. After some time though, it became clear that while data scraping and alerting are great, there's a big need for turning unstructured data in all forms -- webpages, documents, emails, chat logs, etc -- into structure that can be easily understood, transformed and used by both humans and machines. So while right now we're starting with web and documents, our long-term goal is to build the "zapier for data" -- in much the same way Zapier connects actions, we connect sources, transformations and outputs of data.
Q: Can you explain the underlying AI and computer vision technologies that power Jsonify’s smart agents? How do they autonomously turn any webpage into meaningful JSON data (if it's not secret sauce you'd rather not discuss)?
A: There's a bit of secret sauce involved, but on a high level: we use a custom computer vision model -- so, not GPT4-v or similar -- to label and understand page structure. And we do an OCR pass, and pull data out of the webpage DOM. And then we combine and feed all of that over to another model that can synthesize data, and it makes JSON, which we can then also express as a CSV. Easy! 😅
[Image: an example of the labelling model in action]
Q: How does the no-code dashboard work for users? What steps do users take to create and customize their data tasks using Jsonify’s platform?
A: Users can set up data pipelines ("workflows") using the dashboard. They're deliberately pretty simple, and you can generally read them from top to bottom, for example: "Every 24h, open these pages, extract all their data, sync it to a Google sheet."
It's simple, but flexible enough that variations on these workflows make a lot of cool stuff possible.
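To make the "read it top to bottom" idea concrete, here is a small sketch of such a workflow in Python. It is not Jsonify's API: the `Workflow` class, its fields, and `fake_extract` are hypothetical names invented for illustration, the AI extraction step is replaced by a stub that returns canned data, and the schedule is only stored, not executed. The last step shows how JSON-like records can be expressed as CSV text, the kind of table you would sync to a sheet.

```python
import csv
import io
from dataclasses import dataclass
from typing import Callable

# Hypothetical type for illustration only; not Jsonify's real data model.
Record = dict[str, str]


@dataclass
class Workflow:
    every_hours: int                          # "Every 24h" (stored, not scheduled here)
    urls: list[str]                           # "open these pages"
    extract: Callable[[str], list[Record]]    # "extract all their data"

    def run_once(self) -> str:
        """Run the steps top to bottom and return CSV text ready to sync."""
        records = [rec for url in self.urls for rec in self.extract(url)]
        # Union of keys in first-seen order, so uneven records fit one table.
        columns = list(dict.fromkeys(key for rec in records for key in rec))
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer, fieldnames=columns, restval="", lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(records)
        return buffer.getvalue()


def fake_extract(url: str) -> list[Record]:
    """Stand-in for the AI extraction step; returns canned data."""
    return [{"url": url, "title": f"Page at {url}"}]


workflow = Workflow(
    every_hours=24,
    urls=["https://example.com/a", "https://example.com/b"],
    extract=fake_extract,
)
csv_text = workflow.run_once()
print(csv_text)
```

Passing the extraction step in as a function keeps the sketch honest about what it does not know: everything interesting happens inside that step, and here it is only a placeholder.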
Q: What are some of the most innovative or surprising tasks that Jsonify’s AI agents have been able to accomplish for your clients?
A: My favorite users are always the ones who, when they see Jsonify, they say: hey, that gave me a great idea for a startup! We're seeing multiple users start whole new startups on top of the system, which is amazing and really validates this whole "zapier for data" platform play. The most fun user is probably a startup over in Hollywood, who is using Jsonify agents to pull data out of callsheets -- a daily film schedule document -- for different studios. It turns out each studio and production uses different ad-hoc PDF, docx and webpages for their documents, and they change all the time. But Jsonify is able to make sense of them, turning them into the same structure reliably. And there's another really fun startup using us to build a global shopping cart, letting you add any product from any ecommerce website to a single unified list. Thanks to Jsonify being able to turn any product page into a standard 'product' datastructure, they're able to automatically support every page without building specific integrations. This would have been completely impractical to build before Jsonify -- maintaining site-specific scrapers for every ecommerce vendor would quickly drive the developers mad.
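The "standard 'product' datastructure" Paul mentions can be pictured as one shared target schema that every shop's page gets mapped into. The sketch below only illustrates that idea: the `Product` fields, the alias lists, and the "USD" default are our assumptions, and a simple key lookup stands in for the AI model that Jsonify actually uses to understand pages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    """Hypothetical shared schema; field names are assumptions."""
    name: str
    price: float
    currency: str
    url: str


# Hypothetical aliases that different shops might use for the same field.
NAME_KEYS = ("name", "title", "product_name")
PRICE_KEYS = ("price", "amount", "cost")


def first_present(raw: dict, keys: tuple[str, ...]):
    """Return the first non-empty value found under any of the given keys."""
    for key in keys:
        if raw.get(key) not in (None, ""):
            return raw[key]
    return None


def to_product(raw: dict, url: str) -> Product:
    """Map one shop-specific record onto the shared Product schema."""
    name = first_present(raw, NAME_KEYS)
    price = first_present(raw, PRICE_KEYS)
    if name is None or price is None:
        raise ValueError(f"missing name or price in {raw!r}")
    # Simplification: strips a leading currency symbol and thousands commas,
    # so it does not handle prices written with a decimal comma.
    price_value = float(str(price).lstrip("$€£").replace(",", ""))
    return Product(
        name=str(name).strip(),
        price=price_value,
        currency=str(raw.get("currency", "USD")),  # assumed default
        url=url,
    )


shop_a = to_product({"title": " Mug ", "amount": "$1,299.50"}, "https://shop-a.example/mug")
shop_b = to_product({"product_name": "Mug", "price": 12, "currency": "EUR"}, "https://shop-b.example/mug")
print(shop_a)
print(shop_b)
```

Once both records share one shape, a single unified shopping list is just a list of `Product` objects, which is why per-site integrations stop being necessary.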
Q: With the platform processing nearly a million requests, how does Jsonify maintain high performance and scalability for a growing number of users and data tasks?
A: It's tough! Scaling right now is our big challenge -- it's a great problem to have, but a problem nonetheless. We optimized for accuracy to start, but we're now working on scale too. We've built a pretty good cloud infrastructure that's able to scale up and down, and speed is improving a lot.
Q: In a landscape with tools like Zapier, how does Jsonify differentiate itself in terms of functionality, flexibility, and user experience?
A: We're 100% focussed on our data-nerdery space. There's a lot of companies like AgentHub, Twin, Zapier -- who are working in the action space and do data 'on the side'. We're not interested in taking actions on webpages, we're strictly data side. This is a tool that's really easy to build… badly. If you ask ChatGPT to do this, it sure can, about 50% of the way. People don't want 50%, they want it to work all the time, everywhere. That is quite hard. And luckily this has proven pretty sticky. We're generally such an important part of the data pipeline, that users tend to stick with us over time.
Q: What upcoming features or improvements are you most excited about? How do you see Jsonify evolving in the next 2-5 years to continue helping businesses grow with reliable data?
A: Data is only growing. I could cite figures and studies showing how much time we all spend dealing with it, but I suspect I don't need to -- anecdotally, just look at all the webpages you need to check, emails you get, messages you send and receive, documents piling up in Dropbox or iCloud. All increasing. Our big vision is that we'll be the central information clearinghouse for bringing all unstructured data under control. Whenever there's a pile of chaos threatening to bury people -- there'll be us, to automatically trawl through the soup, bring out the useful parts, and present it in a way that's useful.
Today's playlist isn't the usual collection of songs by a specific artist or on a specific topic. It consists of songs written with math in mind, literally. Every song has something to do with data and math: structure, rhythm, chord sequences, and so on. Listen to them on Spotify:
"Lateralus" by Tool: This song is a masterpiece of mathematical precision, inspired by the Fibonacci sequence. The syllables in the lyrics follow this sequence, and the time signatures shift frequently, creating a complex, spiraling rhythm that mirrors the golden ratio. The song explores themes of human evolution and consciousness, perfectly blending intellectual complexity with emotional depth.
"The Dance of Eternity" by Dream Theater: Known for its incredibly intricate structure, "The Dance of Eternity" is a showcase of Dream Theater’s technical prowess. The instrumental track includes over 100 time signature changes, moving seamlessly between 5/8, 7/8, and other unusual meters. Each band member displays virtuosic skill, making it a challenging piece for both performers and listeners.
"Tom Sawyer" by Rush: A classic of progressive rock, "Tom Sawyer" features intricate drumming by Neil Peart, known for his use of polyrhythms and odd time signatures. The song shifts between 7/8 and 4/4, creating a dynamic and engaging rhythm. Its precise instrumental interplay and thought-provoking lyrics have made it a staple of the genre.
"Bleed" by Meshuggah: This song is a prime example of Meshuggah’s use of polyrhythms and precise, mechanical drumming. The guitars and drums often play different rhythms simultaneously, creating a dense and challenging listening experience. The relentless precision of "Bleed" showcases the band’s technical skill and innovative approach to rhythm.
"Fracture" by King Crimson: "Fracture" is a tour de force of complex time signatures and intricate guitar work by Robert Fripp. The song moves through various sections, each with its own challenging rhythmic structure, including 11/8 and 13/8 meters. The intricate interplay between instruments creates a rich, layered soundscape that is both intellectually stimulating and musically satisfying.
"Scarified" by Racer X: "Scarified" is an instrumental track known for its blistering speed and technical complexity. Paul Gilbert’s guitar work features rapid alternate picking, sweeping arpeggios, and intricate tapping sequences. The song’s precise and challenging riffs are a testament to the mathematical approach to composition, making it a favorite among guitar enthusiasts.
"Windowlicker" by Aphex Twin: Search YouTube for a spectrogram video of this single. The face of its author, Richard D. James, is hidden in the spectrogram of the single's second track (often referred to as "Equation"), while "Windowlicker" itself ends with a spiral.
"Strawberry Fields Forever" by The Beatles: Known for its innovative production techniques and complex arrangement, this song features unusual chord progressions and time signature changes that create a dreamy, surreal soundscape.
"Change (In the House of Flies)" by Deftones: Known for its haunting melodies and complex rhythmic structure, this song incorporates unusual time signatures and polyrhythms. The interplay between the instruments and Chino Moreno's dynamic vocal performance adds to the song's intricate, layered sound.
"Mathletics" by Foals: True to the math rock genre, "Mathletics" features irregular time signatures and shifting rhythms that challenge traditional song structures, creating a unique and engaging listening experience.
That's all for today! See you next week. And don't forget to check your email on Thursday: you'll find our regular Startupt.co newsletter with the most important news about the startup industry in Portugal.