
Re: [ljc] Cookie Parse Regex

From: Neil B.
Sent on: Monday, February 4, 2013 12:56 PM
Way to write maintainable code!
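
For comparison, one lower-maintenance route might be to skip the hand-written regex and let the JDK parse the header. The sketch below uses java.net.HttpCookie.parse; the class name HttpCookieSketch and the helper parseFirst are made up for illustration, and the sample header leaves out Expires because the date in this thread is masked.

```java
import java.net.HttpCookie;
import java.util.List;

public class HttpCookieSketch {

    // Hypothetical helper: parse a Set-Cookie header value and return the first cookie.
    static HttpCookie parseFirst(String header) {
        List<HttpCookie> cookies = HttpCookie.parse(header);
        return cookies.get(0);
    }

    public static void main(String[] args) {
        // Sample value modelled on the one in the thread, without the masked Expires date.
        HttpCookie cookie = parseFirst(
                "cookie_name=cookie_value; Domain=www.cookie.com; Comment=cookie_comment; "
                + "Max-Age=86399; Path=/cookie; Secure; Version=1");

        System.out.println("name    = " + cookie.getName());
        System.out.println("value   = " + cookie.getValue());
        System.out.println("domain  = " + cookie.getDomain());
        System.out.println("path    = " + cookie.getPath());
        System.out.println("max-age = " + cookie.getMaxAge());
        System.out.println("secure  = " + cookie.getSecure());
    }
}
```

Whether this fits depends on how strict the validation needs to be; the regex in the thread rejects characters that HttpCookie may accept.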


On Mon, Feb 4, 2013 at 12:53 PM, Samir Talwar <[address removed]> wrote:
After trying it and seeing your point, this made no sense to me either, so I tried porting it to Ruby, which has a much more advanced regex engine than Java:

  COMMENT = "Comment"
  DOMAIN = "Domain"
  EXPIRES = "Expires"
  MAX_AGE = "Max-Age"
  PATH = "Path"
  VERSION = "Version"
  SECURE = "Secure"

  COOKIE_REGEX = /(?:^|\s)([^\s\(\)\[\]\{\}=,\"\"\/?@:;]+)=([^\s\(\)\[\]\{\}=,\"\"\/?@:;]+)
                   (?:
                     ;\s+(#{COMMENT}=[^\(\)\[\]\{\}=,\"\"\/?@:;]+)|
                     ;\s+(#{DOMAIN}=[^\s\(\)\[\]\{\}=,\"\"\\@:;]+)|
                     ;\s+(#{EXPIRES}=[^\(\)\[\]\{\}=\"\"\/?@;]+)|
                     ;\s+(#{MAX_AGE}=\d+)|
                     ;\s+(#{PATH}=[^\s\(\)\[\]\{\}=,\"\"\\?@:;]+)|
                     ;\s+(#{VERSION}=[\d]+)|
                     ;\s+(#{SECURE})
                   )*/ix

  COOKIE = "cookie_name=cookie_value; " +
           DOMAIN + "=www.cookie.com; " +
           COMMENT + "=cookie_comment; " +
           MAX_AGE + "=86399; " +
           PATH + "=/cookie; " +
           EXPIRES + "=Sun, 03 Feb[masked]:31:57 GMT; " +
           SECURE + "; " +
           VERSION + "=1"

  puts COOKIE_REGEX.match(COOKIE).captures

This works perfectly fine. I can only conclude it's Java's fault.
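
The readable multi-line layout above comes from Ruby's /x flag. Java has a comparable option, Pattern.COMMENTS, which ignores whitespace in the pattern and treats # as the start of a comment. The following is only a sketch: the TOKEN character class is a simplified stand-in for the longer classes in the thread, and the class and method names are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CommentedCookieRegex {

    // Simplified token class (assumption): anything except whitespace, ';', '=' and ','.
    private static final String TOKEN = "[^\\s;=,]+";

    // COMMENTS mode lets the pattern be laid out over several lines with inline notes.
    static final Pattern COOKIE = Pattern.compile(
            "^(" + TOKEN + ")=(" + TOKEN + ")   # groups 1 and 2: name and value\n" +
            "(?:                                # any number of attributes\n" +
            "    ;\\s+(Domain=" + TOKEN + ")    # group 3\n" +
            "  | ;\\s+(Max-Age=\\d+)            # group 4\n" +
            "  | ;\\s+(Path=" + TOKEN + ")      # group 5\n" +
            "  | ;\\s+(Secure)                  # group 6\n" +
            ")*$",
            Pattern.CASE_INSENSITIVE | Pattern.COMMENTS);

    // Hypothetical helper: return one capture group, or null if the input does not match.
    static String group(String cookie, int index) {
        Matcher matcher = COOKIE.matcher(cookie);
        return matcher.matches() ? matcher.group(index) : null;
    }

    public static void main(String[] args) {
        String cookie = "cookie_name=cookie_value; Domain=www.cookie.com; "
                + "Max-Age=86399; Path=/cookie; Secure";
        for (int i = 1; i <= 6; i++) {
            System.out.println(i + ": " + group(cookie, i));
        }
    }
}
```

Literal spaces in the pattern are dropped in this mode, so whitespace has to be written as \s, as it already is here.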

— Samir.


On Sun, Feb 3, 2013 at 7:57 PM, Karl Bennett <[address removed]> wrote:
So,

I'm trying to write an über regex to capture all the parts of a Set-Cookie header value.

This is what I've come up with:

private static final Pattern COOKIE_REGEX = Pattern.compile(
    "(?i)(?:^|\\s)([^\\s\\(\\)\\[\\]\\{\\}=,\"\"\\\\/?@:;]+)=([^\\s\\(\\)\\[\\]\\{\\}=,\"\"\\\\/?@:;]+)" +
    "(?:" +
        ";\\s+(Comment=[^\\(\\)\\[\\]\\{\\}=,\"\"\\\\/?@:;]+)|" +
        ";\\s+(Domain=[^\\s\\(\\)\\[\\]\\{\\}=,\"\"\\\\@:;]+)|" +
        ";\\s+(Expires=[^\\(\\)\\[\\]\\{\\}=\"\"\\\\/?@;]+)|" +
        ";\\s+(Max-Age=\\d+)|" +
        ";\\s+(Path=[^\\s\\(\\)\\[\\]\\{\\}=,\"\"\\\\?@:;]+)|" +
        ";\\s+(Version=[\\d]+)|" +
        ";\\s+(Secure)|" +
        "(this is here to force the Secure capture. Don't know why it's needed.)" +
    ")*");

// Raw regex.
// (?i)(?:^|\s)([^\s\(\)\[\]\{\}=,""\\/?@:;]+)=([^\s\(\)\[\]\{\}=,""\\/?@:;]+)(?:;\s+(Comment=[^\(\)\[\]\{\}=,""\\/?@:;]+)|;\s+(Domain=[^\s\(\)\[\]\{\}=,""\\@:;]+)|;\s+(Expires=[^\(\)\[\]\{\}=""\\/?@;]+)|;\s+(Max-Age=\d+)|;\s+(Path=[^\s\(\)\[\]\{\}=,""\\?@:;]+)|;\s+(Version=[\d]+)|;\s+(Secure)|(this is here to force the Secure capture. Don't know why it's needed.))*

Now if you run this regex over a sample cookie value such as:

cookie_name=cookie_value; Domain=www.cookie.com; Comment=cookie_comment; Max-Age=86399; Path=/cookie; Expires=Sun, 03 Feb[masked]:56:51 GMT; Secure; Version=1

It works fine and we get the following captures:

[
    "cookie_name=cookie_value; Domain=www.cookie.com; Comment=cookie_comment; Max-Age=86399; Path=/cookie; Expires=Sun, 03 Feb[masked]:56:51 GMT; Secure; Version=1",
    "cookie_name",
    "cookie_value",
    "Comment=cookie_comment",
    "Domain=www.cookie.com",
    "Expires=Sun, 03 Feb[masked]:57:52 GMT",
    "Max-Age=86399",
    "Path=/cookie",
    "Version=1",
    "Secure"
]

Now, as you might have already noticed, there is a rather odd capture group at the end of that regex. The problem is that if it's not there, the "Secure" fragment won't be captured. Could someone please tell me why this is? Is my regex wrong? It is a little bonkers.

Here is some starter code to help with debugging:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CookieRegexTest {

    public static final String COMMENT = "Comment";
    public static final String DOMAIN = "Domain";
    public static final String EXPIRES = "Expires";
    public static final String MAX_AGE = "Max-Age";
    public static final String PATH = "Path";
    public static final String VERSION = "Version";
    public static final String SECURE = "Secure";

    private static final Pattern COOKIE_REGEX = Pattern.compile(
            "(?i)(?:^|\\s)([^\\s\\(\\)\\[\\]\\{\\}=,\"\"\\\\/?@:;]+)=([^\\s\\(\\)\\[\\]\\{\\}=,\"\"\\\\/?@:;]+)" +
                    "(?:" +
                    /**/";\\s+(" + COMMENT + "=[^\\(\\)\\[\\]\\{\\}=,\"\"\\\\/?@:;]+)|" +
                    /**/";\\s+(" + DOMAIN + "=[^\\s\\(\\)\\[\\]\\{\\}=,\"\"\\\\@:;]+)|" +
                    /**/";\\s+(" + EXPIRES + "=[^\\(\\)\\[\\]\\{\\}=\"\"\\\\/?@;]+)|" +
                    /**/";\\s+(" + MAX_AGE + "=\\d+)|" +
                    /**/";\\s+(" + PATH + "=[^\\s\\(\\)\\[\\]\\{\\}=,\"\"\\\\?@:;]+)|" +
                    /**/";\\s+(" + VERSION + "=[\\d]+)|" +
                    /**/";\\s+(" + SECURE + ")|" +
                    /**/"(this is here to force the Secure capture. Don't know why it's needed.)" +
                    ")*");

    private static final String COOKIE = "cookie_name=cookie_value; " +
            DOMAIN + "=www.cookie.com; " +
            COMMENT + "=cookie_comment; " +
            MAX_AGE + "=86399; " +
            PATH + "=/cookie; " +
            EXPIRES + "=Sun, 03 Feb[masked]:31:57 GMT; " +
            SECURE + "; " +
            VERSION + "=1";

    private static String printGroups(Matcher matcher) {

        StringBuilder builder = new StringBuilder();

        for (int i = 1; i < matcher.groupCount(); i++) builder.append(matcher.group(i)).append('\n');

        return builder.toString();
    }

    public static void main(String[] args) {

        Matcher matcher = COOKIE_REGEX.matcher(COOKIE);
        matcher.matches();

        System.out.println(printGroups(matcher));
    }
}
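
A likely explanation for the missing "Secure" lies in printGroups rather than in the regex. Matcher.groupCount() does not count group 0 (the whole match), so valid group indices run from 1 to groupCount() inclusive. A loop written as i < matcher.groupCount() stops one short and never prints the last group, which here is Secure; adding a dummy tenth group raises the count and lets the loop reach group 9. The sketch below uses the same pattern without the dummy group and an inclusive bound. The class and method names are invented for the example, and the sample cookie omits Expires because the date in the thread is masked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CookieGroupsSketch {

    // Same pattern as in the thread, minus the dummy trailing group.
    static final Pattern COOKIE_REGEX = Pattern.compile(
            "(?i)(?:^|\\s)([^\\s\\(\\)\\[\\]\\{\\}=,\"\"\\\\/?@:;]+)=([^\\s\\(\\)\\[\\]\\{\\}=,\"\"\\\\/?@:;]+)" +
            "(?:" +
            ";\\s+(Comment=[^\\(\\)\\[\\]\\{\\}=,\"\"\\\\/?@:;]+)|" +
            ";\\s+(Domain=[^\\s\\(\\)\\[\\]\\{\\}=,\"\"\\\\@:;]+)|" +
            ";\\s+(Expires=[^\\(\\)\\[\\]\\{\\}=\"\"\\\\/?@;]+)|" +
            ";\\s+(Max-Age=\\d+)|" +
            ";\\s+(Path=[^\\s\\(\\)\\[\\]\\{\\}=,\"\"\\\\?@:;]+)|" +
            ";\\s+(Version=[\\d]+)|" +
            ";\\s+(Secure)" +
            ")*");

    // Collect groups 1..groupCount() inclusive; note the <= in the loop bound.
    static List<String> groupsOf(Matcher matcher) {
        List<String> groups = new ArrayList<>();
        for (int i = 1; i <= matcher.groupCount(); i++) {
            groups.add(matcher.group(i)); // null if the group took no part in the match
        }
        return groups;
    }

    // Hypothetical convenience wrapper: match the whole string and return its captures.
    static List<String> captures(String cookie) {
        Matcher matcher = COOKIE_REGEX.matcher(cookie);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Not a recognised cookie: " + cookie);
        }
        return groupsOf(matcher);
    }

    public static void main(String[] args) {
        String cookie = "cookie_name=cookie_value; Domain=www.cookie.com; "
                + "Comment=cookie_comment; Max-Age=86399; Path=/cookie; Secure; Version=1";
        // Nine entries; the Expires slot is null and the last entry is "Secure".
        for (String group : captures(cookie)) {
            System.out.println(group);
        }
    }
}
```

This would also account for the Ruby port working: MatchData#captures returns every group except group 0, so there is no loop bound to get wrong.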


Cheers,
Karl




--
Please Note: If you hit "REPLY", your message will be sent to everyone on this mailing list ([address removed])
This message was sent by Karl Bennett ([address removed]) from LJC - London Java Community.
To learn more about Karl Bennett, visit his/her member profile