Building Scalable Predictive Data Pipelines: the Data DevOps way of realizing data pipelines.
We care about the trends, tools, languages, platforms, practices, patterns, and frameworks that aid in designing, architecting, developing, implementing, deploying, and maintaining scalable predictive data pipelines.
The following list gives only a glimpse of the areas we care about, and it could go on much longer: ingestion, cleansing, exploratory data analysis (EDA), feature engineering, statistical analysis, model building, automated training and retraining, model performance monitoring, dashboarding of model predictions, pattern recognition, and so on.
We are also interested in crunching many kinds of data and building pipelines around them: tabular, semi-structured, time-series, streaming, image, audio/speech, video, graph, text, geospatial, etc.
This meetup is open to all data enthusiasts.
The first meeting will be an introduction and will set expectations for this meetup. The moderator will pick the topic for the next meetup based on a majority vote. Members are requested to come with topics of interest.
Example topics include:
Kubernetes-based data processing/ML frameworks. Building a basic fraud detection data pipeline on stream data. Best practices in building scalable data pipelines. Building fault-tolerant data pipeline stages. The rise of functional languages and functional constructs for distributed systems. Building a text summarization pipeline. Productionizing a deep learning model. Picking a language for predictive pipelines. Building Docker images for ML solutions.
The meetup will be discussion-based rather than instruction-based because of the sheer vastness of the area we are exploring.
We can do demos, discussions, and teaching sessions, and compete or debate on designing and architecting data pipelines. Any new ideas are appreciated.