
Real-world Hadoop applications (built in Bucharest)

Hosted By
Jason R.

Details

Hi everybody,

We're happy to announce our inaugural meetup this August. Let's get together, learn about each other's interests and fire up some talks!

At Avira Romania we've developed several applications on the Hadoop stack that power millions of customer interactions per hour. For our first meetup, I propose that we present a couple of our current consumer use cases.

My employer is also sponsoring the event by providing the venue, drinks and food.

The talks:

• Calin Burloiu, Couchdoop

• Corneliu Balaban, Soft Authentication

Abstracts:

1.) Couchdoop by Calin Burloiu

Couchdoop is a Hadoop connector for Couchbase that can import, export and update data. The connector can be used both as a command-line tool, which works with CSV files from HDFS, and as a library for MapReduce jobs, for more flexibility. The library provides a Hadoop InputFormat, which reads data from Couchbase by querying a view, and an OutputFormat, which stores key-value pairs read from any Hadoop source in Couchbase. The OutputFormat also supports other useful operations, such as deleting, counting or changing the expiry of documents. Couchdoop can update existing Couchbase documents using data from other Hadoop sources. Imagine a recommendation system that stores item scores in Couchbase documents: after rerunning a machine learning algorithm over user event data in Hadoop, the scores in Couchbase can be updated directly. Couchdoop aims to be a better alternative to the official Couchbase Sqoop connector, which can only import a full bucket or stream documents for a configurable amount of time.
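
To make the score-update scenario above concrete, here is a minimal sketch in Python. It does not use Couchdoop or the Couchbase SDK: a plain dict stands in for a Couchbase bucket, and the `update_scores` function, the `item::N` key format and the `score` field are all hypothetical names chosen for illustration. It only shows the idea of updating existing documents from CSV rows of the form `key,score`.

```python
import csv
import io


def update_scores(bucket, csv_text):
    """Update the 'score' field of existing documents from 'key,score' CSV rows.

    `bucket` is a plain dict standing in for a key-value store (an assumption
    for this sketch, not a real Couchbase client). Rows whose key is not
    already present are skipped, so only existing documents are updated.
    Returns the number of documents updated.
    """
    updated = 0
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # ignore blank lines
        key, score = row
        doc = bucket.get(key)
        if doc is None:
            continue  # unknown key: nothing to update
        doc["score"] = float(score)
        updated += 1
    return updated


# Hypothetical documents and freshly computed scores.
bucket = {
    "item::1": {"title": "Antivirus Pro", "score": 0.1},
    "item::2": {"title": "System Speedup", "score": 0.4},
}
new_scores = "item::1,0.9\nitem::3,0.5\n"
count = update_scores(bucket, new_scores)
```

In this sketch only `item::1` is updated; `item::3` is ignored because no such document exists, and `item::2` keeps its old score.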

2.) Soft Authentication by Corneliu Balaban

Soft Authentication (SAUTH) is a large-scale backend application that authenticates and manages a company’s users and the products and devices they use, while offering complete anonymity and privacy through anonymous tokens. Using Java, Couchbase, Flume and Hadoop to persist user data, we can not only authenticate users and create user profiles in real time, but also simultaneously identify the devices on which they use our products, in order to deliver maximum security. SAUTH supports tens of thousands of operations per second and, thanks to a schemaless database, provides maximum flexibility for enriching customer profiles and serving them directly from its in-memory database to our company’s consumer-facing applications. SAUTH also includes various fuzzy matching algorithms that let it map users to multiple devices (of any type) at runtime, or determine whether an unregistered user is actually a known registered one.
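
As a rough illustration of what fuzzy device matching can look like, here is a minimal Python sketch. It is not SAUTH's algorithm: the device attributes, the `similarity` and `match_user` helpers and the 0.85 threshold are all assumptions made for this example. It compares a device fingerprint against known devices using the standard library's `difflib.SequenceMatcher` and returns the best match only if it is similar enough.

```python
from difflib import SequenceMatcher

# Hypothetical known devices keyed by user id; not SAUTH's real schema.
KNOWN_DEVICES = {
    "user-1": {"os": "Windows 7 SP1", "model": "Dell Latitude E6430", "locale": "ro_RO"},
    "user-2": {"os": "Android 4.4.2", "model": "Samsung GT-I9505", "locale": "en_US"},
}


def similarity(a, b):
    """Average case-insensitive string similarity over the fields of `a`."""
    scores = [
        SequenceMatcher(None, a[field].lower(), b.get(field, "").lower()).ratio()
        for field in a
    ]
    return sum(scores) / len(scores)


def match_user(fingerprint, known=KNOWN_DEVICES, threshold=0.85):
    """Return the id of the best-matching known user, or None if below threshold."""
    best_id, best_score = None, 0.0
    for user_id, device in known.items():
        score = similarity(fingerprint, device)
        if score > best_score:
            best_id, best_score = user_id, score
    return best_id if best_score >= threshold else None


# A fingerprint with a small difference in the model string still matches.
near = {"os": "Windows 7 SP1", "model": "Dell Latitude E6430s", "locale": "ro_RO"}
# A clearly different device matches nobody.
unknown = {"os": "iOS 7.1", "model": "iPhone 5s", "locale": "de_DE"}
```

The threshold trades false matches against missed ones; a real system would likely weight fields differently and tune the cutoff on real data.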

Hadoop User Group Bucharest (HUG)
Avira Romania
26 Armand Calinescu Street, 4th floor · Bucharest