dictshield

From: James Dennis
Sent on: Monday, November 29, 2010 6:22 AM
Hello again -

I have another project that might be useful to people. It's called DictShield. And its exceptions are DictPunches!


It's reasonable to think about DictShield like MongoEngine (where it came from) with the actual database interactions stripped out. I found some cases where MongoEngine didn't perform as well as using pymongo directly but I liked having a system for structuring data. I then expanded upon that and now DictShield is how I manage my document structures & validation.

It behaves a bit like a typed dictionary for Python, with the goal of making it easy to take input from a remote source, validate it, and then translate it into a document ready to be stored in a NoSQL store like MongoDB. It's Mongo-centric for now, but I'd like to expand it.

It also attempts to make it easy to reshape dictionaries by removing certain fields before transmission. This is useful primarily for cleaning the dictionaries depending on who they're being sent to. The three layers of concern are: internal data, data sent to the owner of the document, and data sent to the public.


DOCUMENTS

Defining a document looks similar to defining a Django model:
    class Media(Document):
        """Simple document that has one StringField member
        """
        title = StringField(max_length=40)
I'll instantiate the class and give it a value.
    m = Media()
    m.title = 'Misc Media'
This produces a python dictionary like this:
    {
        '_types': ['Media'],
        '_cls': 'Media',
        'title': u'Misc Media'
    }
We see '_types' and '_cls', which tell us the class hierarchy and the class used for this dictionary. As this implies, you can subclass Media and see that represented in the dictionary also:
    class Movie(Media):
        year = IntField(min_value=1950, 
                        max_value=datetime.datetime.now().year)
        personal_thoughts = StringField(max_length=255)
This dictionary looks like:
    {
        'personal_thoughts': u'I wish I had three hands...', 
        '_types': ['Media', 'Media.Movie'], 
        'title': u'Total Recall', 
        '_cls': 'Media.Movie',
        'year': 1990
    }
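
For the curious, here is roughly how '_types' and '_cls' values like these could be derived from a class hierarchy. This is only a standalone sketch in plain Python, not DictShield's actual code: the Document class below is an empty stand-in and type_names is a made-up helper, and it assumes single inheritance.

```python
class Document(object):
    """Stand-in base class for this sketch (not dictshield's Document)."""


class Media(Document):
    pass


class Movie(Media):
    pass


def type_names(cls):
    """Return dotted names for cls and its ancestors below Document,
    oldest ancestor first. Hypothetical helper, single inheritance only."""
    # Walk the MRO from the base end, skipping object and Document.
    chain = [c.__name__ for c in reversed(cls.__mro__)
             if c not in (object, Document)]
    # Build 'Media', then 'Media.Movie', and so on.
    return ['.'.join(chain[:i + 1]) for i in range(len(chain))]


types = type_names(Movie)
print(types)       # ['Media', 'Media.Movie']
print(types[-1])   # Media.Movie
```

The last entry in the list is the same string that shows up as '_cls'.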

VALIDATION
Validation is made easy with DictShield via classmethods on Document OR a validate() function on document instances. First, here is a User document which uses an MD5Field for its password (use bcrypt, though).
    class User(Document):
        _public_fields = ['name']
        secret = MD5Field()
        name = StringField(required=True, max_length=50)
        bio = StringField(max_length=100)
We also see _public_fields. I will get to that soon.
First, let's enter some bogus data by setting the md5 to something that doesn't look like an md5.
    u = User()
    u.secret = 'whatevz'
    u.name = 'test hash'
This will fail to validate by throwing a DictPunch exception. We don't like bad data, so we punch it in the dict and send it on its way.
    try:
        u.validate()
    except DictPunch, dp:
        print('DictPunch caught: %s' % (dp))
An MD5Field looks like this:
    class MD5Field(BaseField):
        """A field that validates input as resembling an MD5 hash.
        """
        hash_length = 32
        def validate(self, value):
            if len(value) != MD5Field.hash_length:
                raise DictPunch('MD5 value is wrong length',
                                self.field_name, value)
            try:
                x = int(value, 16)
            except:
                raise DictPunch('MD5 value is not hex',
                                self.field_name, value)
As you can see, the field is basically just a validate() function.
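
To try the same two checks without DictShield installed, here is a standalone sketch. The function name looks_like_md5 is made up for illustration; it returns True or False instead of raising DictPunch, but the length and hex tests mirror the field above.

```python
def looks_like_md5(value):
    """Return True if value is 32 characters long and parses as hex.

    Hypothetical helper mirroring the two checks in MD5Field.validate().
    """
    if len(value) != 32:
        return False
    try:
        int(value, 16)
    except ValueError:
        return False
    return True


print(looks_like_md5('e8b5d682452313a6142c10b045a9a135'))  # True
print(looks_like_md5('whatevz'))                           # False
print(looks_like_md5('z' * 32))                            # False
```

Note this is only a shape check. It says the value resembles an MD5 hash, not that it is the hash of anything in particular.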

HANDLING POTENTIALLY ROGUE INPUT
So you just ran json.loads on some input and now you want to figure out if it's valid. You can first trim the keys in the dictionary down to just what's found in the document.
    total_input = {
        'secret': 'e8b5d682452313a6142c10b045a9a135',
        'name': 'J2D2',
        'bio': 'J2D2 loves music',
        'rogue_field': 'MWAHAHA',
    }
First, we'll validate the fields to see if it's worth moving forward with this document.
    try:
        User.validate_class_fields(total_input)
    except DictPunch, dp:
        print('  DictPunch caught: %s\n' % (dp))
Validation passed, so next we'll translate it to a Python dictionary, with Mongo types handled, by calling .to_mongo().
    user_doc = User(**total_input).to_mongo()
After this, our document looks like below:
    {
        '_types': ['User'], 
        'bio': u'J2D2 loves music', 
        'secret': 'e8b5d682452313a6142c10b045a9a135', 
        'name': u'J2D2', 
        '_cls': 'User'
    }
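
Notice that 'rogue_field' is gone. The idea of trimming the keys down to what the document knows about can be sketched in plain Python as below. This is not DictShield's code: trim_to_fields is a made-up helper and the list of field names is typed out by hand rather than read from the User class.

```python
def trim_to_fields(data, field_names):
    """Return a copy of data keeping only the keys in field_names.

    Hypothetical helper; the input dictionary is left untouched.
    """
    return dict((k, v) for k, v in data.items() if k in field_names)


total_input = {
    'secret': 'e8b5d682452313a6142c10b045a9a135',
    'name': 'J2D2',
    'bio': 'J2D2 loves music',
    'rogue_field': 'MWAHAHA',
}

# Field names of the User document above, listed by hand for this sketch.
user_fields = ['secret', 'name', 'bio']

trimmed = trim_to_fields(total_input, user_fields)
print(sorted(trimmed))  # ['bio', 'name', 'secret']
```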

OUTPUT SHAPING
DictShield offers methods of shaping the output depending on who you're sending it to. Let's revisit that Movie class from earlier. Remember _public_fields?
    class Movie(Media):
        """Subclass of Media. Adds year and personal_thoughts and limits
        publicly shareable fields to 'title' and 'year'.
        """
        _public_fields = ['title','year']
        year = IntField(min_value=1950, 
                        max_value=datetime.datetime.now().year)
        personal_thoughts = StringField(max_length=255)
First, there is the internal structure of a document, which represents exactly how it's stored in Mongo.
    movie_doc = Movie(**our_data).to_mongo()
    {
        'personal_thoughts': u'I wish I had three hands...', 
        '_types': ['Media', 'Media.Movie'], 
        'title': u'Total Recall', 
        '_cls': 'Media.Movie',
        'year': 1990
    }
Then, we can call Movie.make_json_ownersafe(movie_doc) and DictShield removes _types and _cls. These are considered internal and are defined in dictshield.documents.Document as _internal_fields. _internal_fields is a black list.
    {
        'personal_thoughts': u'I wish I had three hands...',
        'title': u'Total Recall',
        'year': 1990
    }
We could also call Movie.make_json_publicsafe(movie_doc) and DictShield will remove all fields except what is listed in _public_fields. _public_fields is a white list, so we end up with only title and year left in the dictionary.
    {
        'title': u'Total Recall',
        'year': 1990
    }
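
The black list versus white list difference can be sketched with two small functions. This is a standalone illustration, not how DictShield implements it: make_ownersafe and make_publicsafe are made-up names operating on a plain dictionary, and the two field lists are copied by hand from the examples above.

```python
INTERNAL_FIELDS = ['_types', '_cls']   # black list: always stripped
PUBLIC_FIELDS = ['title', 'year']      # white list: all the public may see


def make_ownersafe(doc, internal_fields=INTERNAL_FIELDS):
    """Drop every key named in the black list (hypothetical helper)."""
    return dict((k, v) for k, v in doc.items() if k not in internal_fields)


def make_publicsafe(doc, public_fields=PUBLIC_FIELDS):
    """Keep only the keys named in the white list (hypothetical helper)."""
    return dict((k, v) for k, v in doc.items() if k in public_fields)


movie_doc = {
    'personal_thoughts': u'I wish I had three hands...',
    '_types': ['Media', 'Media.Movie'],
    'title': u'Total Recall',
    '_cls': 'Media.Movie',
    'year': 1990,
}

print(sorted(make_ownersafe(movie_doc)))   # ['personal_thoughts', 'title', 'year']
print(sorted(make_publicsafe(movie_doc)))  # ['title', 'year']
```

A new field added to the document later would show up in the owner-safe output automatically, but stays out of the public output until it is added to the white list.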

James
