Big Data

Meet other local people interested in Big Data!
3,407 members
2 groups

Largest Big Data groups

2,296 members

Frequently Asked Questions

Yes! Check out big data events happening today here. These are in-person gatherings where you can meet fellow enthusiasts and participate in activities right now.

Discover all the big data events taking place this week here. Plan ahead and join exciting meetups throughout the week.

Absolutely! Find big data events near your location here. Connect with your local community and discover events within your area.

Big Data Events Near You

Connect with your local Big Data community

Building Scalable Customer Identity Resolution Pipelines on AWS Using AI
Customer identity resolution becomes increasingly complex as organizations scale across multiple systems, regions, and data formats. Traditional rule-based approaches often fail to keep up with data variability, require constant manual tuning, and struggle with real-time processing needs. This session presents a practical approach to building a scalable identity resolution pipeline using AWS services and modern AI techniques. The architecture combines data ingestion through Amazon S3 and AWS Glue, transformation pipelines using Spark on EMR, and machine learning models deployed via SageMaker for entity matching and standardization. Graph-based relationship modeling is implemented using Amazon Neptune to improve resolution accuracy by incorporating household and shared attribute context. We will walk through how machine learning models can be used for name and address normalization, how intelligent blocking strategies improve matching efficiency, and how feedback loops can be introduced to continuously improve accuracy. The session also highlights how serverless components such as AWS Lambda can be used for orchestration and real-time processing. **SPEAKER BIO** Mosaic Syed is a Senior Data Engineering and Cloud Solutions Architect with over 20 years of experience designing and delivering scalable, secure, and high-performance data solutions across global enterprise environments. https://www.linkedin.com/in/mosaic-basha-syed-92300856 **CALL FOR SPEAKERS** Learn more: [https://www.awscolumbus.com/get-involved/](https://www.awscolumbus.com/get-involved/) **THANK YOU** *Veeam* for hosting our meetup! To learn more about *Veeam*, please visit their website: [https://www.veeam.com/](https://www.veeam.com/) **DIRECTIONS** 8800 Lyra Dr #450, Columbus, OH. Go to the 4th floor. **Want to sponsor the pizza and/or bar tab?** Please contact me if you would like to sponsor this meetup's pizza and/or bar tab: angelo@mandato.com
CBusData - Practical AI for Power BI Developers
A year ago, “agentic AI” was mostly hype for Power BI teams. Today, it deserves your undivided attention. For Power BI pros, there is now a real opportunity to reduce repetitive development work, accelerate delivery, and help developers do more, but only when strong DataOps practices are in place to make AI workflows effective. This session is a no-nonsense introduction to effective AI patterns for Power BI and Fabric development. Along the way, we will make sense of the growing pile of terminology, including skills, plugins, hooks, and MCP. You will see examples of how modern AI tooling can help with development tasks across Power BI and Fabric, along with the prerequisites, guardrails, and DataOps principles needed to use it responsibly. Whether you're burned out on AI hype or already using Copilot CLI daily, this session will show you the foundations that are finally making AI-assisted development genuinely useful.
COhPy Monthly Meeting
**Improving Office in Franklinton** Physical location: Improving Office, 330 Rush Alley, Suite #150, Columbus, OH 43215. Schedule: 6:00 p.m.: Socialize, eat, and drink. Improving will be providing pizza and beverages. 6:30 to 8:00 p.m.: Main meeting and presentation(s). Topic: This month John Lairson will share a notebook describing the Alpaca (Paper) Trading API and discuss different algorithms for evaluating stock trades. We meet on the last Monday of each month. Presentations are given by members and friends of this group. If you would like to give a presentation (small or large) on a Python topic, please contact Central OH Python at centralohpython@gmail.com
Best Practices for Building a Reliable Lakehouse
**Abstract:** This is a practical playbook for building a production-grade data lakehouse. It walks through foundational principles (naming conventions, least-privilege access, automated CI/CD testing) before diving into medallion architecture. Metadata-driven design patterns then show how configuration tables and dynamic notebook orchestration eliminate hard-coded pipelines. The deck covers star schema modeling, guidance on choosing between Spark, Pandas, and SQL, and data quality enforcement using DQX with YAML data contracts. Finally, we dive into security best practices and performance optimizations. **Host:** Justin Shea, Mehdi Jeddi, Erik Pak, and Sou-Cheng Choi **Talk Format:** This is a hybrid event. To attend online, join us on Zoom here at 6pm: https://iit-edu.zoom.us/j/89379230295?pwd=NdETyE5sdYuSrvsrBZXSBFkUESBVkg.1 Meeting ID: 893 7923 0295 Passcode: 5t5WYn **Sponsor:** Adyen, UIC College of Business, and PyData Chicago co-host this event. UIC will provide the meeting site. Adyen will sponsor pizza and soft drinks for the onsite participants. **Address:** University of Illinois - Chicago, Douglass Hall, Room 220, 705 S Morgan St, Chicago, IL 60607 **Logistics:** “UIC Douglass Hall” is recognized on Google Maps, which can guide you through campus. Once you arrive, proceed to the second floor, room 220.
LLM Showdown: ChatGPT vs Claude vs Gemini vs Local Models
Join us for a practical, beginner-friendly guide to choosing the right large language model. We’ll compare major models like ChatGPT, Claude, Gemini, and Llama, talk about when to use hosted APIs versus local models, and break down the tradeoffs around cost, speed, quality, privacy, context windows, coding ability, and reliability. You’ll leave with a clearer mental model for picking an LLM based on your actual use case instead of hype, benchmarks, or brand names. No deep AI background required. LOGISTICS AND PARKING: The talk starts at 7:00 PM. The first half hour is reserved for everyone to get set up and mingle. Free pizza and drinks! The cheapest parking option is to find street parking, which will only cost you a few bucks. Otherwise, park in the nearby veteran's museum lot for $8. It's highly recommended you avoid the nearby $15 garage parking.
DoJo (Informal Python Meeting)
**Latest Dojo Location!** **Knotty Pine Brewing** 1765 W 3rd Ave, Columbus, OH 43212 We're going to try a new dojo location for a few weeks and see how it works. Dojos are informal Python group study sessions where everyone interested in Python gathers to learn about Python, help others with Python, or just hang out. Everyone is welcome, from Python beginners to experts. Bringing a laptop is encouraged (we'll have extension cords and power strips). If there's something you want to learn, leave a comment on this invite so we can plan ahead. We're looking for speakers for our Monthly Meetups! Fill out the form if you are interested in presenting to the Python Community. https://forms.gle/ehSfUAC2WgR34Crq9
NFT AI ART Columbus
NFTs are here to stay, folks! This is a group for like-minded people interested in understanding, leveraging, using, creating for, profiting from, and trading NFTs, along with everything around them: complexity, fear and exploits, best practices, and more. **PLUS** This group will talk about AI art tools, techniques, artists, video, audio, prototypes, and more in the AI-assisted production space. The focus is art specifically, but we can get into any of the cooler things happening in AI in general.