

What we're about
To see all meetups in this group: https://www.meetup.com/pro/ibm-community/
This is an IBM-sponsored Meetup group geared towards developers, data scientists, data engineers, and all Big Data, Cloud and AI enthusiasts. Our meetups provide an opportunity to work hands-on with the solutions and tools in our Big Data portfolio and to interact and share knowledge with experts at IBM and in our extended community.
Our meetups typically include a 45-60 minute (max) presentation that serves as an introduction and overview for a specific Big Data technology. It is followed by about 3 hours to collaborate with fellow developers and apply your Big Data skills. Depending upon the location, we can provide a cloud environment that you can run through the browser of your laptop at no cost to you. Our meetups are free.
Meetup topics include:
- Hadoop-based analytics
- Open Source Hadoop, SQL on Hadoop, R on Hadoop, Integration, Governance, ...
- Real Time Analytics & Stream Computing
- Text Analytics
- Visualization and Discovery tools for Big Data
- Big Data App Development
- Big Data & Cloud
- NoSQL
- Internet of Things (IoT)
- Deep dives into the technologies that make big data processing possible
- Anything and everything about Big Data
Join us today for a hands-on software development experience.
Upcoming events (2)
- Network event: [AI Alliance] Chat with your website using an LLM
148 attendees from 111 hosting groups
Link visible for attendees
Abstract
Imagine being able to ask questions about a website in natural language and receiving meaningful answers instead of simple keyword matches. In this talk, I’ll introduce Allycat, an open-source, end-to-end stack that enables conversational interaction with website content using Large Language Models (LLMs).
We’ll walk through the complete pipeline:
- Crawling and indexing website content
- Cleaning and extracting meaningful information from HTML
- Creating embeddings and storing them in a vector database
- Querying the data using an LLM for contextual, accurate responses
We’ll also demonstrate Allycat’s lightweight UI that allows users to interactively test their queries. The entire stack is built with Python and open-source components, making it easy to adopt, adapt, and extend.
You can check out Allycat here: https://github.com/The-AI-Alliance/allycat
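The pipeline steps listed in the abstract can be illustrated with a toy sketch. This is not Allycat's code and makes no claim about its API: the `PAGES` dictionary stands in for crawled pages, word counts stand in for a real embedding model, the `VectorStore` class is a hypothetical in-memory substitute for a vector database, and the final step only builds the prompt that would be sent to an LLM. It uses the Python standard library only.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def clean_html(html):
    """Step 2: reduce an HTML page to its visible text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def embed(text):
    """Step 3 (toy): word counts. A real stack would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self.items = []  # list of (vector, text) pairs

    def add(self, text):
        self.items.append((embed(text), text))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(question, context_chunks):
    """Step 4 (toy): assemble the prompt an LLM would receive."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Step 1 (assumed already done): hypothetical crawled pages.
PAGES = {
    "/about": "<html><body><h1>About</h1><p>Our meetups are free to attend.</p>"
              "<script>var x=1;</script></body></html>",
    "/topics": "<html><body><p>Topics include stream computing and text analytics.</p></body></html>",
}

store = VectorStore()
for html in PAGES.values():
    store.add(clean_html(html))

question = "Are the meetups free?"
top = store.search(question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Swapping `embed` for a real embedding model and `VectorStore` for an actual vector database would keep the same overall shape: clean, embed, store, retrieve, then prompt.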
Audience
AI/ML Engineers, Data Engineers, Data Scientists interested in building intelligent, LLM-powered search and chatbot interfaces.
Level
Beginner to Intermediate
Format
45-minute presentation with demonstration
About the speaker
Sujee Maniyam (AI Engineer, Developer Advocate @ Node51) is an expert in Generative AI, Machine Learning, Deep Learning, Big Data, Distributed Systems, and Cloud technologies. He is passionate about developer education and fostering community engagement. Sujee has led numerous training sessions, hackathons, and workshops. He is also an author, open source contributor and frequent speaker at conferences and meetups.
About the AI Alliance
The AI Alliance is an international community of researchers, developers and organizational leaders committed to supporting and enhancing open innovation across the AI technology landscape to accelerate progress, improve safety, security and trust in AI, and maximize benefits to people and society everywhere. Members of the AI Alliance believe that open innovation is essential to develop and achieve safe and responsible AI that benefits society rather than a select few big players.
- Network event: [AI Alliance] GneissWeb: Preparing High Quality Data for LLMs at Scale
151 attendees from 111 hosting groups
Link visible for attendees
Details
IBM recently released GneissWeb, a large dataset yielding around 10 trillion tokens that caters to the data quality and quantity requirements of training Large Language Models. In this talk I will do a deep dive on the philosophy behind this dataset, where it stands relative to other available datasets, how to recreate it with the tools IBM has open sourced, and some performance figures obtained with it. This talk is a follow-up to the talk given by Shahrokh Daijavad of IBM in March.
Prerequisites
This is a follow-up to our March 6, 2025 session “Introducing GneissWeb - a state-of-the-art LLM pre-training dataset”:
- Check the GitHub show notes
- Re-watch on YouTube
About the presenter
Bishwaranjan Bhattacharjee (LinkedIn), Senior Technical Staff Member and Master Inventor, IBM Research
About the AI Alliance
The AI Alliance is an international community of researchers, developers and organizational leaders committed to supporting and enhancing open innovation across the AI technology landscape to accelerate progress, improve safety, security and trust in AI, and maximize benefits to people and society everywhere. Members of the AI Alliance believe that open innovation is essential to develop and achieve safe and responsible AI that benefits society rather than a select few big players.
Past events (206)
- Network event: [AI Alliance] Introducing gofannon
251 attendees from 109 hosting groups
The event has passed