2007-03-06

Version 9, changed by jaredj 03/07/2007.

Agenda Item 1:  Status on outcome of last week's meeting.

  • Quick status on implementation of fetch and find sorting APIs as decided on in last week's meeting (Jared Jurkiewicz)
  • Discussion of any new thoughts/issues related to fetch/find
Result from meeting:
Agreement that good progress was being made.  Jared J to send the prototype implementation of the fetch API to Brian Skinner for review. The prototype includes some basic range caching in the Result object when saveResult is true.  This helps avoid going back to the datastore every time, since a range that was already fetched may contain the requested items.
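
The range caching idea can be sketched roughly as follows. This is not the prototype's code: `RangeCache`, `store` and `lookup` are hypothetical names, and the sketch only answers a request when a single previously fetched range fully contains it.

```javascript
// Hypothetical sketch of range caching; not the actual Result.js prototype.
function RangeCache() {
  this.ranges = []; // each entry: {start: Number, items: Array}
}

// Remember the items fetched for the range beginning at index `start`.
RangeCache.prototype.store = function (start, items) {
  this.ranges.push({ start: start, items: items.slice() });
};

// Return the requested items if one cached range fully contains them,
// otherwise null (the caller must then go back to the datastore).
RangeCache.prototype.lookup = function (start, count) {
  for (var i = 0; i < this.ranges.length; i++) {
    var range = this.ranges[i];
    var offset = start - range.start;
    if (offset >= 0 && offset + count <= range.items.length) {
      return range.items.slice(offset, offset + count);
    }
  }
  return null;
};

var cache = new RangeCache();
cache.store(10, ["k", "l", "m", "n", "o"]); // items 10..14 were fetched earlier
var hit = cache.lookup(11, 3);  // fully inside the cached range
var miss = cache.lookup(13, 5); // runs past the cached range
```

A real implementation would presumably also merge overlapping ranges and evict old entries; both are left out here.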

Jared J to post the patch to the trackers above once there is general agreement that the prototype is good enough to consider committing.

Jared J also sent an updated version to Douglas Hays for review.


Agenda Item 2:  Query syntax for partial matches.

Dijit needs datastores to provide a consistent API for querying partial string values in find(), to support scenarios where Select widgets retrieve their selectable items from a datastore.  The Select widget provides a string value that is used as a query parameter to narrow the set of data items returned.

For example, find only the items where a property value *begins with* "ca", or where a property value includes "ca".

There are varying degrees of complexity for how query params can be handled by a datastore in dojo.data.  The first thing we would like to decide is what degree of complexity is sufficient for this particular use case (Select widget string value as search param for dojo.data find()).

The following four options need to be discussed to determine which (one or more) will be supported in dojo.data 1.0.

Which of these possible capabilities shall datastores be required to implement as part of the datastore API?

Possible Capabilities (in order of complexity)

Option 1 (Agreed as the general approach for 0.9/1.0: simple filtering capability only, enough for widgets like Combobox to work):

Simple wildcard pattern match for string properties (single character, and multi-character)

  • Multi-character pattern match syntax alternatives

    1. A) Use SQL style pattern characters 

      • store.find({query:{name:"ca%"}}); // or "%ca%"

    2. B) Use reg-exp style pattern characters

      • store.find({query:{name:"ca*"}}); // Or *ca*, *c*a*, etc.
  • Single-character pattern match syntax
    • store.find({query:{name:"?alifornia"}}); // match California or california or Xalifornia...  // Not for 0.9.  Only support *
Thoughts from Bill Keese:
That seems fine.   Easy to convert to a database query.  Users might try to type a more complicated query like "ca[a-z]*".  There are efficient ways to handle that too but I'd rather just say it's not supported for version 1.
Thoughts from Jared Jurkiewicz:
Keep it simple for 0.9 and 1.0.  I would vote for just basic support for *.  Maybe ?.  That would cover most general use-cases I would think. 
Thoughts from Brian Skinner:
I think there's an underlying issue here, namely: how do we reconcile the fact that we have
(a) some datastores that get data from databases which may be able to do pattern matching natively, and (b) some datastores that get data from web services which may not be able to do pattern matching natively. Do we try to make a pattern-matching API that attempts to hide the fact that sometimes the pattern matching is being done on the client and sometimes on the server?
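
One way to keep a single Option 1 syntax while letting the matching happen on either side is to translate the wildcard pattern into whatever the executing side understands. The sketch below is an illustration only: `wildcardToRegExp` and `wildcardToLike` are hypothetical helper names, matching is case-sensitive, and the backslash escaping in the LIKE pattern assumes the database is told to use backslash as its escape character.

```javascript
// Escape regular-expression metacharacters, leaving * and ? untouched
// so they can be translated afterwards.
function escapeRegExpExceptWildcards(text) {
  return text.replace(/[.+^${}()|[\]\\]/g, "\\$&");
}

// Client side: turn "ca*" into an anchored RegExp.
function wildcardToRegExp(pattern) {
  var source = escapeRegExpExceptWildcards(pattern)
    .replace(/\*/g, ".*")  // * matches any run of characters
    .replace(/\?/g, ".");  // ? matches exactly one character
  return new RegExp("^" + source + "$");
}

// Server side: turn "ca*" into a SQL LIKE pattern.
// Literal %, _ and \ in the input are escaped first (assumed escape char: \).
function wildcardToLike(pattern) {
  return pattern
    .replace(/[%_\\]/g, "\\$&")
    .replace(/\*/g, "%")
    .replace(/\?/g, "_");
}

var startsWithCa = wildcardToRegExp("ca*");
var clientMatch = startsWithCa.test("california"); // true
var likePattern = wildcardToLike("ca*");           // "ca%"
```

With helpers like these, a client-side store and a database-backed store could accept the same query value and the caller would not need to know where the match runs.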

Option 2:

Full regexp  

    • store.find({query:{name:/^ca/}}); // or /ca/
  • This capability could be difficult or impossible to implement server side in RDB datastores, but seems possible for non-RDB stores that are client-side.
Thoughts from Bill Keese:
I suppose that's fine too.  For a database store, a simple query like the one above could be converted to LIKE "ca%" and processed by the database; a more complex query would need to be processed in the app server.
Thoughts from Jared Jurkiewicz:
My concern with this one is that it could get extremely complex for some datastores to implement regular expression matching.  It may open up a can of worms. 
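
The conversion Bill describes could be sketched as below. `simpleRegExpToLike` is a hypothetical helper, and it deliberately recognises only the trivial cases (an optional `^`, plain letters, digits and spaces, an optional `$`). Anything else returns null, meaning the expression would have to be evaluated somewhere other than the database.

```javascript
// Convert a trivially simple RegExp into a SQL LIKE pattern, or return null
// when the expression is too complex to hand to the database.
function simpleRegExpToLike(re) {
  if (re.flags !== "") {
    return null; // flags such as "i" change the semantics; give up
  }
  var parts = /^(\^?)([A-Za-z0-9 ]+)(\$?)$/.exec(re.source);
  if (!parts) {
    return null; // character classes, quantifiers, alternation, etc.
  }
  var leading = parts[1] ? "" : "%";   // no ^ means "may start anywhere"
  var trailing = parts[3] ? "" : "%";  // no $ means "may end anywhere"
  return leading + parts[2] + trailing;
}

var prefixLike = simpleRegExpToLike(/^ca/);       // "ca%"
var containsLike = simpleRegExpToLike(/ca/);      // "%ca%"
var tooComplex = simpleRegExpToLike(/ca[a-z]*/);  // null
```

The null branch is where Jared's concern applies: every store would need its own fallback for expressions it cannot translate.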

Option 3:
Parameterized native query string
find() can already support a "passthrough" to the datastore implementation.
ie. the string version of find() is a direct passthrough to the
datastore implementation

  store.find("gumption/art+history");
  store.find("SELECT * FROM FOO WHERE BLAH IN ('A','B') AND GLOM CONTAINS 'SPLAT' ");
  • Rather than passing a native query string into find(), an API could be provided so that a set of query strings could be registered with a datastore during its initialization, where each string could contain one or more parameters, like:
On a delicious store,
  store.addQuery("query1","/");
On an RDB store,
  store.addQuery("WHERE user='') AND tags CONTAINS '' ");
  • During find(), parameters could be specified for each parameter, to be substituted into the predefined parameterized query strings. ex.
     store.find({query:{user:"gumption", tags: ['art','history']}});
  • For a partial search, a Select widget would pass its values into the tags parameter, and the user parameter could be determined and set from some other variable (assuming tags are delimited with some separator char, such as space or comma).
Thoughts from Bill Keese:
Hmm, it's limiting because you are hardcoding which values you can query.  If there are 5 attributes, that makes a lot of possible queries.   I don't like that one so much.
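
A minimal sketch of the Option 3 idea follows. The placeholder syntax is an assumption: the notes do not show how parameters are marked inside a registered string, so `{user}` and `{tags}` are invented here, as are `ParameterizedQueries`, `build` and the query name. Array values are joined with "+" to match the delicious-style passthrough string shown earlier.

```javascript
// Hypothetical registry of named, parameterized native query strings.
function ParameterizedQueries() {
  this.templates = {};
}

// Register a template such as "{user}/{tags}" under a name.
ParameterizedQueries.prototype.addQuery = function (name, template) {
  this.templates[name] = template;
};

// Substitute the supplied parameters into the named template.
ParameterizedQueries.prototype.build = function (name, params) {
  var template = this.templates[name];
  if (template === undefined) {
    throw new Error("Unknown query: " + name);
  }
  return template.replace(/\{(\w+)\}/g, function (whole, key) {
    if (!Object.prototype.hasOwnProperty.call(params, key)) {
      throw new Error("Missing parameter: " + key);
    }
    var value = params[key];
    return Array.isArray(value) ? value.join("+") : String(value);
  });
};

var queries = new ParameterizedQueries();
queries.addQuery("bookmarksByTags", "{user}/{tags}");
var nativeQuery = queries.build("bookmarksByTags", {
  user: "gumption",
  tags: ["art", "history"]
});
// nativeQuery is "gumption/art+history"
```

Plain string substitution like this suits URL-style queries only. For an RDB store, the values would need to be bound as real statement parameters rather than spliced into SQL text.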
Option 4:
Complex abstract dynamic query

      store.find({query:{name:{beginsWith:"ca"}}}); // or {includes:"ca"}
  • This capability would include additional operators for:
    • Boolean predicates (AND, OR)
    • NOT
    • LIKE
    • IN (simple sets from previous query result or string array values)
  • This capability is possible to implement server-side for RDB stores, but could be complex to implement on client-side datastores (e.g. JSONStore)
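
To give a feel for what a client-side store would have to do for Option 4, here is a small sketch of an evaluator for the `beginsWith` and `includes` operators shown above. `matchesQuery` and the `operators` table are hypothetical names. Attributes are combined with an implicit AND, and the Boolean, NOT and IN operators are not covered.

```javascript
// Operator implementations for string-valued attributes.
var operators = {
  beginsWith: function (value, operand) { return value.indexOf(operand) === 0; },
  includes: function (value, operand) { return value.indexOf(operand) !== -1; }
};

// Return true when `item` satisfies every attribute condition in `query`.
// A condition is either a plain value (exact match) or an object of operators.
function matchesQuery(item, query) {
  return Object.keys(query).every(function (attr) {
    var condition = query[attr];
    var value = String(item[attr]);
    if (typeof condition !== "object" || condition === null) {
      return value === String(condition);
    }
    return Object.keys(condition).every(function (op) {
      if (!Object.prototype.hasOwnProperty.call(operators, op)) {
        throw new Error("Unsupported operator: " + op);
      }
      return operators[op](value, condition[op]);
    });
  });
}

var items = [{ name: "california" }, { name: "canada" }, { name: "america" }];
var beginning = items.filter(function (item) {
  return matchesQuery(item, { name: { beginsWith: "ca" } });
});
var containing = items.filter(function (item) {
  return matchesQuery(item, { name: { includes: "ca" } });
});
```

Even this reduced form needs an operator table and a recursive-looking evaluator, which hints at why a full version was seen as complex for client-side stores.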

Agenda Item 3:  Layer partial match capability.

How do we layer this on top of a datastore that already has its own notion of what a query looks like? Do we think that most datastore-specific queries can be mapped onto an abstract form of dynamic query (Option 4) that can be specified in JSON syntax?  Or is the string substitution approach (Option 3) to querying better?

For example, here are some dojo.data.DeliciousStore queries:
store.find({query:"gumption"}); // "gumption" is a user name
store.find({query:"gumption/art+history"}); // find anything for gumption matching art history
store.find({query:"tags/gumption"});  // find all tags ever used by user gumption
eg. 'Where user is gumption and tags in ("art", "history")

Thoughts from Brian Skinner:

// here's the simple vanilla DeliciousStore, which
// can't do partial string matching
var store = new dojo.data.DeliciousStore();
var allTagsForUserGumption = store.find({query:"tags/gumption"});

// but presto-chango:
var rawStore = new dojo.data.DeliciousStore();
dojo.data.magicDatastoreUpgrader(rawStore);
var allTagsForUserGumption = rawStore.find({query:"tags/gumption"});
var matchingTags = rawStore.find({query:"tags/gumption", filter:'ca'});
// where 'ca' could be something like /^ca/ or ca% or whatever

Or, alternatively, dojo.data.DeliciousStore could just inherit the client-side filtering capability from some abstract base class, so the magicDatastoreUpgrader() call isn't even needed -- at the cost of increasing the DeliciousStore code footprint for everybody who doesn't even want the filtering capability.

The downside is that the syntax for the find() args now has both a 'query' keyword and a 'filter' keyword, which may be conceptually ugly, or may be a pain for any datastore that actually can do server side string matching, where you really only want a single integrated query parameter.
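
A rough sketch of what such an upgrader might do is shown below. It is purely illustrative: the function stands in for the imagined `dojo.data.magicDatastoreUpgrader`, `fakeStore` replaces the real DeliciousStore, find() is assumed to return a plain array synchronously, and `filter` is treated as a simple "begins with" string.

```javascript
// Wrap a store's find() so that an optional `filter` argument is applied
// on the client after the store's own query has run.
function magicDatastoreUpgrader(store) {
  var rawFind = store.find;
  store.find = function (args) {
    // The wrapped store only ever sees the `query` keyword.
    var items = rawFind.call(store, { query: args.query });
    if (args.filter === undefined) {
      return items;
    }
    var prefix = String(args.filter);
    return items.filter(function (item) {
      return String(item).indexOf(prefix) === 0;
    });
  };
  return store;
}

// Stand-in store that knows nothing about partial matching.
var fakeStore = {
  find: function (args) {
    return ["california", "cars", "art", "history"];
  }
};

magicDatastoreUpgrader(fakeStore);
var allTags = fakeStore.find({ query: "tags/gumption" });
var matchingTags = fakeStore.find({ query: "tags/gumption", filter: "ca" });
```

The store author writes only the inner find(); the cost is the second keyword in the arguments, which is the downside described above.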
Thoughts from Bill Keese:
> store.find({query:"gumption"}); // "gumption" is a user name
in JSON: {user: "gumption"}

> store.find({query:"gumption/art+history"}); // find anything for
> gumption matching art history

in JSON: {user: "gumption", content: "art history"}

> store.find({query:"tags/gumption"}); 
// find all tags ever used by user gumption

in JSON: {tag: "gumption"}

Thoughts from Jared Jurkiewicz:
I like Bill Keese's suggestion.  Redo such datastores to take query parameters in the form of a JSON object with attributes.  The underlying datastore can map those into a query for the server, and it ought to make it possible to handle wildcarding of attributes as well.

Result from Meeting:

Queries should have some basic structure instead of being freeform text.  Basically, a query should be a simple anonymous object of name/value pairs where each attribute represents only *one* thing.  For example:
store.find({query:"tags/gumption"});

Should be done as :

Find all tags for user 'gumption'.
store.find({query:{type: "tag", user: "gumption"}});
Or for example, find all tags for user 'gumption' that start with 'ca':
store.find({query: {type: "tag", user:"gumption", tags: ['ca*']}});
In this case, the delicious store can return tags or bookmarks, so the 'type' attribute tells the service what you want returned.  In the above examples, you want tags.  To return the bookmarks under all tags starting with 'ca', it would be:
store.find({query: {type: "bookmark", user:"gumption", tags: ['ca*']}});

The underlying datastore should then map these single items into whatever query format the service actually uses.  This is rather like JDBC prepared statements in a way, where the datastore has a set of queries that it substitutes these values in.
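
The mapping step could look something like this sketch for the delicious case. `toNativeQuery` is a hypothetical helper, and the native strings follow the "tags/gumption" and "gumption/art+history" forms used earlier in these notes. Because the service cannot match wildcards, any pattern it cannot handle is handed back as `clientFilters` for the store to apply itself.

```javascript
// Map a structured query onto a delicious-style native query string.
// Returns the native string plus the tag patterns that still need to be
// applied on the client.
function toNativeQuery(query) {
  var tags = query.tags || [];
  if (query.type === "tag") {
    // The service returns all of a user's tags, so every pattern is
    // filtered on the client.
    return { nativeQuery: "tags/" + query.user, clientFilters: tags };
  }
  var exact = tags.filter(function (tag) { return tag.indexOf("*") === -1; });
  var wild = tags.filter(function (tag) { return tag.indexOf("*") !== -1; });
  var nativeQuery = query.user + (exact.length ? "/" + exact.join("+") : "");
  return { nativeQuery: nativeQuery, clientFilters: wild };
}

var tagQuery = toNativeQuery({ type: "tag", user: "gumption", tags: ["ca*"] });
// tagQuery.nativeQuery is "tags/gumption", tagQuery.clientFilters is ["ca*"]

var bookmarkQuery = toNativeQuery({
  type: "bookmark",
  user: "gumption",
  tags: ["art", "history"]
});
// bookmarkQuery.nativeQuery is "gumption/art+history"
```

The caller only ever sees the name/value object, and the split between server query and client filter stays inside the store.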

That way, wildcarding can potentially be applied to any of the atomic items in the list, and the filtering is determined by the datastore itself.  Note that from the user's perspective, whether the filtering occurs on the client or server cannot be determined from the usage.  Smart datastores should push filtering to the server when possible.  They should also cache searches as much as possible to avoid going back to the server all the time.
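
The caching point matters most for type-ahead widgets, which issue a stream of queries such as "c*", "ca*", "cal*". The sketch below shows one possible shape under several assumptions: `CachingTagStore` and `fetchAllTags` are hypothetical, the fetch is synchronous, the cache never expires, and only a trailing "*" wildcard is supported.

```javascript
// Match `value` against a pattern with an optional trailing "*".
function matchesPrefixPattern(value, pattern) {
  if (pattern.charAt(pattern.length - 1) === "*") {
    return value.indexOf(pattern.slice(0, -1)) === 0;
  }
  return value === pattern;
}

// Store that fetches a user's full tag list once, then filters in memory.
function CachingTagStore(fetchAllTags) {
  this.fetchAllTags = fetchAllTags; // stand-in for the real service call
  this.cache = {};                  // user name -> array of tags
}

CachingTagStore.prototype.find = function (args) {
  var query = args.query;
  if (!Object.prototype.hasOwnProperty.call(this.cache, query.user)) {
    this.cache[query.user] = this.fetchAllTags(query.user);
  }
  var patterns = query.tags || ["*"];
  return this.cache[query.user].filter(function (tag) {
    return patterns.some(function (pattern) {
      return matchesPrefixPattern(tag, pattern);
    });
  });
};

var serverCalls = 0;
var tagStore = new CachingTagStore(function (user) {
  serverCalls++;
  return ["california", "cars", "calendar", "art"];
});

var afterC = tagStore.find({ query: { type: "tag", user: "gumption", tags: ["c*"] } });
var afterCa = tagStore.find({ query: { type: "tag", user: "gumption", tags: ["ca*"] } });
var afterCal = tagStore.find({ query: { type: "tag", user: "gumption", tags: ["cal*"] } });
// serverCalls is 1: only the first find() reached the service
```

A store backed by a server that can filter natively would instead pass the pattern through, and the calling code would look the same.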




Transcript:

    [INFO]    Channel view for “#dojo-meeting” opened.
    -->|    YOU (jared_j) have joined #dojo-meeting
    =-=    Topic for #dojo-meeting is “http://dojo.jot.com/Irc070228”
    =-=    Topic for #dojo-meeting was set by wild_bill on Wednesday, February 28, 2007 6:03:37 PM
    *DojoMeetingBot*    This channel is logged - http://www.dojotoolkit.org/logs/meetings
    -->|    brian_skinner (n=brian_sk@adsl-68-122-193-57.dsl.pltn13.pacbell.net) has joined #dojo-meeting
    -->|    haysmark (n=chatzill@74-140-236-187.dhcp.insightbb.com) has joined #dojo-meeting
    <jared_j>    Give a few more minites for people to pop in before we get rolling?
    <brian_skinner>    okay
    <peller>    sorry, I'm going to be away from the keyboard for the first part of the meeting. I'll catch up
    <wild_bill>    hi. i guess so. is there a new agenda or just the old one?
    <brian_skinner>    new http://dojo.jot.com/2007-03-06
    <jared_j>    There's a new one out there.
    <jared_j>    Trying to track down Chris too. :)
    <wild_bill>    Jared - your mail was weird. The supposed hyperlink to the 2007-03-06 actually goes to 2007-02-27
    =-=    wild_bill has changed the topic to “http://dojo.jot.com/2007-03-06”
    <jared_j>    That's odd. I much have messed it up in gmail. sorry about that.
    <jared_j>    Most likely due to trying to get the mail out quicky this morning.
    <brian_skinner>    thanks for sending out the mail -- it's great to have you and chris driving this forward
    <wild_bill>    np. ok, let's get started.
    <jared_j>    Sure.
    <wild_bill>    Agenda Item 1: Status on outcome of last week's meeting.
    <wild_bill>    I don't have anything new there to discuss.
    <jared_j>    I do. Give me a sec to type it. :)
    <jared_j>    Allright, here's what I've been up to since we last met. I've prototyped out the fetch() api onto Result.js. As part of that I tried to come up with a cvery simple LRU cache that Result (and datastores) can use to cache items. As part of the work to get the protrotype working with JsonItemStore, I ended up fixing several bugs in SimpleBaseStore, plus I went and renamed the functions from oncompleted, to onCompleted, etc. I need to fix all the testcases for that bit still but I did put together a standalone JsonItemStore testcase now that follows the patterns of the others. I've got a ZIP of the prototype I can mail out if people want to see what files I've touched and my initial thoughts on how to do the fetch and a bit of caching to try and make it not go back to the store as often.
    <brian_skinner>    yup, i'd love to see what you've done, if you've got something ready to send out
    -->|    ornus (n=Slava@74-129-142-94.dhcp.insightbb.com) has joined #dojo-meeting
    <wild_bill>    It sounds like it's all working? At least well enough to develop widgets against?
    <jared_j>    I basically rigged Result to do 'range caching', so ifrequest ranges that are contained within ranges you've ooked up before, it can subset and return those, etc.
    <jared_j>    wild_bill: I sent a copy to Doug Hays to play with a bit to see if it works for comboox and such.
    <wild_bill>    that's good, but we need the query capability for combobox
    <jared_j>    And get some early feedback while I finish fixing the rest of the testcases to account for the function name switch.
    <jared_j>    And that's today's discussion and what I hope to implement next
    <wild_bill>    tom, on the other hand, should be able to use the current zip file for grid
    <--|    ornus has left #dojo-meeting ("Leaving")
    <wild_bill>    cool. ok, what else on fetch()?
    <wild_bill>    I'm confident that last week we had a good conclusion.
    <jared_j>    With the way i implemented it, DataStores will need to implement one function and attach it to Result on hwo to 'call back' to the Datastore to look things up not in the results cache yet. Same sort of idea as SimpelBaseStore and the _findItems implementation.
    <jared_j>    Or they could over-ride both user level fetch and callback if they want. Doesn't really matter.
    <jared_j>    I just tried t make Result generally usable as is without much, if any, DS attaching of things to it.
    <brian_skinner>    that sounds good
    <jared_j>    So i'lls send out the zip to Brian to look at. Should I send it to the general contrib list too?
    <wild_bill>    Hmm, i guess so. I'm not sure. Brian, if you like it then you can check it in?
    <brian_skinner>    yup, that sounds good -- and then later we can update all the other datastores as well
    <brian_skinner>    i created trac tickets for a bunch of the dojo.data work -- for the sorting and paging, as well as other stuff
    <brian_skinner>    jared_j, do you have a trac account? if you want, i can assign the sorting and paging bugs to you.
    <wild_bill>    oh come to think of it the patch should just be attached to trac; that's standard procedure.
    <jared_j>    I don't have a trac account yet no.
    <wild_bill>    you can still attach.
    <jared_j>    Do you want the patch as just the files or a diff-patch against the repository?
    <jared_j>    I can do either.
    <brian_skinner>    jared_j: i think you can get an account from dustin or alex
    <brian_skinner>    jared_j: i'm not sure what the standard dojo practice is
    <wild_bill>    probably patch file
    <jared_j>    In either event, I'll send you a copy of the files to look over, Brian.
    <brian_skinner>    cool, thanks
    <jared_j>    Let me do that right now, in fact...
    <wild_bill>    OK, Agenda Item 2: Query syntax for partial matches.
    <wild_bill>    To address what Brian said: I think there's an underlying issue here, namely: how do we reconcile the fact that we have
    <wild_bill>    (a) some datastores that get data from databases which may be able to do pattern matching natively, and (b) some datastores that get data from web services which may not be able to do pattern matching natively. Do we try to make a pattern-matching API that attempts to hide the fact that sometimes the pattern matching is being done on the client and sometimes on the server?
    <wild_bill>    Answer: yes
    <wild_bill>    that's the whole point of an API, to hide implementation details.
    <brian_skinner>    if that answer is yes, then does it make more sense to look at agenda item 3 before agenda item 2 -- it seems like we can't really deal with the query syntax without having a plan about the whole layering thing
    <jared_j>    Probably, as that will affect the current datastore impls.
    <wild_bill>    ok...
    <brian_skinner>    just as a place to start, can we talk about the delicious case...
    <wild_bill>    yeah. well, what we said originally was that not all datastores need to support the standard query syntax. you could obviously make a wrapper like you Brian suggested(magicDatastoreUpgrader)
    <ccmitchell>    hi all, sorry i'm late
    <brian_skinner>    hey ccmitchell
    <brian_skinner>    we're talking about agenda item #3 --- http://dojo.jot.com/2007-03-06
    <jared_j>    Taking the Delicious example, I like Bill's suggestion on breaking down the query into effectivelu attr/value pairs. Then let the uderlying DS take those to compose the query off of, having its knowledge ofof things. With simple name/attrs, it ought to be also possble to specify patterns in them as we need to try to decide in#2.
    <brian_skinner>    jared_j: i'd like to ask more about that
    <brian_skinner>    i'm confused
    <brian_skinner>    as it is now, to get all tags for user alex we do:
    <jared_j>    How so? (Mind you I'm not extremely familiar with Delicious, so it may be me missing something).
    <brian_skinner>    store.find({query:"tags/alex"});
    <brian_skinner>    with bill's approach we could instead do:
    <brian_skinner>    store.find({query:{tag:"alex"}});
    <wild_bill>    i think it would be store.find({query:{user:"alex"}});
    <jared_j>    Would it? Using store.find({query:{user:"alex"}}); oesn't tell you what about user alex to get. Or is it implicit?
    <wild_bill>    (although i'm also not familiar with delicio, but i'm thinking it's like flickr)
    <ccmitchell>    or store.find({query:{tags:["alex","chris","brian"]}});
    <brian_skinner>    hmm... that looks more like you're trying to get the user alex, rather than the tags that alex has created
    <wild_bill>    hmm
    <brian_skinner>    but anyway, we can pick whatever query syntax we like, and the DeliciousStore can convert that into the format that the web service wants
    <jared_j>    Right.
    <wild_bill>    yeah, i don't see any value to delicioStore having a special syntax
    <wild_bill>    so i agree
    <ccmitchell>    i like the object/prop way of specifying the q params
    <wild_bill>    i hadn't thought about a store having heterogenous data objects though.
    <jared_j>    I think the obj/prop way makes agenda item #2 easier to define, as well.
    <ccmitchell>    like the store-specific syntax, it assumes the caller has knowledge of the param names tho. How do i find out what the valid names for a store are?
    <wild_bill>    read the doc?
    <jared_j>    That gets back to the store meta-data we were talking about earlier.
    <brian_skinner>    wild_bill: in the end, i think all the different datastores will have different query semantics -- so i don't think we'll be successful if we try to unify the query syntax
    <wild_bill>    i'd like to have a common subset for most of the stores.
    <jared_j>    If we can't enforce some sort of uniformity at some level, I don't see how we can do #2.
    <ccmitchell>    i think you can "unify" for a certain subset
    <brian_skinner>    if delicious has the notion of tags, but CVS doesn't, then it doesn't matter what syntax we use for the query that asks for all tags
    <wild_bill>    huh? tags is just like a column name.
    <jared_j>    I don't think we need to decide on exact property names.
    <brian_skinner>    i don't think it's save to assume that our queries are just dealing with columns
    <ccmitchell>    right, there's no universal schema, but there can be a common way to specify query params
    <brian_skinner>    how does that help with the issue of partial string matches?
    <ccmitchell>    common, if not universal
    <wild_bill>    brian_skinner: because we can implement Item #2
    <ccmitchell>    do we need to capability to do a dynamic query expression, or do we just need a common way to provide params to an underlying query
    <ccmitchell>    i think the dynamic query would be overkill
    <wild_bill>    me too
    <jared_j>    Agreed.
    <brian_skinner>    i'd love to keep things simple
    <ccmitchell>    these are like prepared statements in jdbc
    <wild_bill>    ok, i am with jared, on Agenda Item #2 / Option #1
    <ccmitchell>    the underlying query is encapsulated in the prepared statment, and parameters are provided for what a client can provide
    <brian_skinner>    so, going back to the delicious example for a second...
    <ccmitchell>    so in the delicious example...
    <brian_skinner>    store.find({query:{tagsForUser:"alex"}}); // or something like that
    <ccmitchell>    the query expression is the "tags/"
    <jared_j>    Which is not visible to the user, right?
    <ccmitchell>    and the param prop you can provide is the tagnames property
    <ccmitchell>    right
    <brian_skinner>    sorry, i got lost -- what's not visible to the user?
    <ccmitchell>    so, we're not trying to unify the schemas or query expression languages... that's too specific to different store types
    <jared_j>    brian_skinner: the query expression is the "tags/"
    <jared_j>    That's not visible.
    <ccmitchell>    we just need a way for a client to pass the named params in
    <jared_j>    All the user asks is: {tagnames: [whatever]} or whatnot.
    <brian_skinner>    jared_j: so, like this, or not? store.find({query:{tagsForUser:"alex"}})
    <jared_j>    Right
    <jared_j>    Like that
    <ccmitchell>    params can be single valued or array (depending on the type of store and the schema of the data being accessed
    <wild_bill>    what kind of objects can delicio return?
    <jared_j>    Under the covers it subs that into the actual url/request structure.
    <brian_skinner>    wild_bill: delicious can return either a list of tags names or a list of bookmarks
    <wild_bill>    hmm
    <ccmitchell>    the set of query expressions themselves can be predefined (eg. hard-coded into the datastore impl in the delicious case)
    <brian_skinner>    okay, so then we also want to be able to do this:
    <brian_skinner>    store.find({query:{tagsForUser:"alex", match:"ca"}})
    <jared_j>    Match against what?
    <brian_skinner>    to get just the tags that start with "ca", or contain "ca", or whatever
    <ccmitchell>    when you provide the list of params in the query, a match can be made against the query expression to use, or a statement handle could be provided so that which query to use is explicit
    <wild_bill>    brian_skinner: that's not the syntax i was thinking at all
    <ccmitchell>    it seems like option 3 in Agenda item 2 handles the delicious case
    <wild_bill>    i'm sure there's a simpler way to handle delicio
    <ccmitchell>    store.find({query:{user:"gumption", tags: ['art','history']}});
    <wild_bill>    that returns bookmarks? or tags?
    <wild_bill>    I thought that the delicio store just returned bookmarks, in which case, the above query look great.
    <ccmitchell>    bookmarks
    <wild_bill>    but if delicio can also return tags ???
    <jared_j>    But ti can also return a list of tags the user defined?
    <wild_bill>    that complicates things....
    <brian_skinner>    yes it can also return the tags that a user defined
    <ccmitchell>    then you'll have to pass something in the query to indicate the type
    <brian_skinner>    which i think is the more relavent thing for this conversation
    <ccmitchell>    store.find({query:{type: "tag" user:"gumption", tags: ['art','history']}});
    <brian_skinner>    because it would actually make sense to want to connect a Combobox to a list of tags
    <jared_j>    And if you wanted to do a fuzzy match on tags, would it be:
    <jared_j>    store.find({query:{type: "tag" user:"gumption", tags: ['ar*','history']}}); ?
    <brian_skinner>    here's the use case i had in mind
    <brian_skinner>    there's a user, gumption, who has 30,000 bookmarks, tagged with 1,000 tags
    -->|    brian_skinner_ (n=brian_sk@adsl-68-122-193-57.dsl.pltn13.pacbell.net) has joined #dojo-meeting
    <brian_skinner_>    doh, i got cutoff
    <brian_skinner_>    the user starts to type a tag name into the combobox
    <ccmitchell>    ok, and we need a combobox that for this user allows him to type ahead into the set of 1000 tags?
    <brian_skinner_>    and from the combobox the user can select an existing tag that matches what they started to type
    <brian_skinner_>    once the user selects a tag, then they see a tableview that shows all the bookmarks that are tagged with that tag
    <wild_bill>    very nice.
    <brian_skinner_>    so, we need a way to ask for all the tags that start with "ca", or include "ca", or whatever
    <jared_j>    Would it be something like:
    <ccmitchell>    store.find({query:{user:"gumption", tags: ['ca%']}});
    <brian_skinner_>    but that's not a feature that the delicious web service offers
    <brian_skinner_>    so we need to layer that on top
    <jared_j>    store.find({query:{type: "tag" user:"gumption", tags: ['ar*']}});
    <ccmitchell>    or
    <brian_skinner_>    right, either of those query syntaxes looks good
    <jared_j>    And if you wanted bookmarks...
    <jared_j>    store.find({query:{type: "bookmark" user:"gumption", tags: ['ar*','history']}}); ?
    <ccmitchell>    store.find({query:{id: "matching tags by user" user:"gumption", tags: ['ca%']}});
    <jared_j>    Or somesouch
    <jared_j>    Er, what is that id: "matching tags by user" ?
    <wild_bill>    jared_j: sounds good. i don't know that delicio needs to support wildcard search for bookmarks like that
    <jared_j>    I don't follow its purpose.
    <ccmitchell>    if there is an ambiguity, you need a type or queryid
    <ccmitchell>    to differentiate
    <jared_j>    If it didn't then the dojo datastore would have to get all tags and filter the list itself.
    <ccmitchell>    this is essentially option 3, agenda item 2
    <ccmitchell>    a store can provide a set of predefined statements
    <brian_skinner_>    okay, but now ignore the query syntax for a minute
    <brian_skinner_>    let's just say for the moment that the query looks like this
    <ccmitchell>    passing queryid or type identifies which statement to pass params into
    <brian_skinner_>    > store.find({query:{user:"gumption", tags: ['ca%']}});
    |<--    peller has left irc.freenode.net ()
    <brian_skinner_>    my question is, how does that query get executed
    <wild_bill>    that's a query for matching tags, right?
    <jared_j>    It would depend on what fuctions in the interface of delicious there are.
    <wild_bill>    apparently delicio doesn't support wildcard searches on tags? so then the client has to do it.
    <jared_j>    If Delicious can't wildcard list tags, then the Delicous datastore would have to get all user tags and apply the filter after the fact.
    <brian_skinner_>    wild_bill: that's a query that asks for all the tags created by user gumption that start with "ca"
    <wild_bill>    brian_skinner: ok, what's the issue?
    <brian_skinner_>    right, the delicious web service returns the list of *all* the tags that user gumption has created, and then we have to filter it on the client side
    <wild_bill>    right
    <jared_j>    Okay. So ... the store can do that. :) It may be a bit slow for large numbes of tags, but it can do it.
    <brian_skinner_>    okay, the issue is, how do we set this up so that the person who writes the dojo.data.DeliciousStore can just write the code that talks to delicious, and doesn't have to worry about this filtering stuff at all
    <brian_skinner_>    how do we make it possible to write a datastore in under 30 lines of code
    <wild_bill>    client filtering library
    |<--    brian_skinner has left irc.freenode.net (Read error: 145 (Connection timed out))
    <brian_skinner_>    for reference, here's the current DeliciousStore code:
    <brian_skinner_>    http://archive.dojotoolkit.org/nightly/src/data/DeliciousStore.js
    <wild_bill>    shouldn't be that hard.... javascript supports regular expressions
    <wild_bill>    just need a flag somewhere that says "the server didn't do [all] the filtering so you have to do it on the client", and then have a base class or library function that filters down the list.
    <brian_skinner_>    wild_bill: that sounds good
    <wild_bill>    Not sure how it interacts with caching though; Maybe we should cache the whole result set?
    <jared_j>    Cache the whole thing, I would think.
    <ccmitchell>    so the client-side query "adapter" is what you were trying to show...
    <jared_j>    That way if another person come sin with user gumption and different fuiler.
    <jared_j>    Er, filter
    <jared_j>    No server call, just filter list in cache.
    <jared_j>    return
    <brian_skinner_>    so how do we tease apart the query expression into the one part that the DeliciousStore author cares about and the other part that adapter cares about?
    <wild_bill>    not only that. you are going to get a stream of queries from combobox: "c%", "ca%", "cal%" just for a single user typing into a combobox
    <ccmitchell>    takes results from previous queries to server side, and does additional filtering client side
    <jared_j>    wild_bill: Right, so you don't keep going back tot he sucker.
    <ccmitchell>    its a compound query
    <jared_j>    The SimpleLRUCache I used for fetch() could be used by a datastore to hold stuff like that.
    <wild_bill>    great
    <wild_bill>    brian_skinner_: i'm not sure. if there is a wildcard then don't add the parameter to the query??
    <wild_bill>    but do it client side
    <jared_j>    For some datastores, yeah, I think it woudl have to chheck the params
    <brian_skinner_>    um, what's "it"?
    <wild_bill>    you are asking if there is library support code to help the datastore author.
    <jared_j>    It beind the datastore
    <jared_j>    being
    <jared_j>    The datastore ultimately has to construct the query it sends to the server
    <jared_j>    Somehow.
    <wild_bill>    brian_skinner_: i think i understand what you are getting at but i don't have the answer yet, and it seems like an implementation detail.
    <brian_skinner_>    i'm asking whether (a) the DeliciousStore author is responsible for parsing all the query parameters, calling the web service, getting the results, and then calling thefiltering-library- helper function to do the filtering
    <wild_bill>    Initially, yes.
    <brian_skinner_>    or (b) there's some universal code that handles all that
    <brian_skinner_>    and then if the answer is (b)
    <brian_skinner_>    which is what i'd like
    <brian_skinner_>    how does that affect what the whole query API looks like
    <wild_bill>    hopefully not at all, right?
    <wild_bill>    it's still store.find({query:{type: "bookmark", user:"gumption", tags: ['ca%']}});
    <wild_bill>    s/bookmark/tag/
    <brian_skinner_>    wild_bill: i don't think there's any way to automatically tease that apart, without special knowledge about delicious
    <brian_skinner_>    whereas, i do think this would work:
    <brian_skinner_>    store.find({query:{type: "bookmark", user:"gumption", postFilter:{tagName:'ca%'}}});
    <brian_skinner_>    oops
    <brian_skinner_>    i mean
    <brian_skinner_>    store.find({query:{type: "tags", user:"gumption", postFilter:{tagName:'ca%'}}});
    <wild_bill>    sorry, i don't want the user to have to know what's postFilter and what's server filter
    <jared_j>    Agreed,
    <brian_skinner_>    right, that's the trade-off
    <wild_bill>    and we need a standard interface for combobox
    <brian_skinner_>    how much do we ask of users, and how much do we ask of datastore authors
    <wild_bill>    we ask more of the datastore authors :-)
    <jared_j>    Users should have to do less than Datastore authors.
    <jared_j>    developer versus user.
    <brian_skinner_>    okay, but it would be cool if we made it easy to write datastores, without having to think about paging/filtering/sorting/sync/async/etc. -- the world is full of thousands of different web services and databases and data file formats, and it would be great if it were *really easy* for someone to connect one of those things
    <wild_bill>    well, you don't have to support wildcards in your datastore... only if you want to use it with combobox
    <jared_j>    Yeah, you don't *have* to support filtering
    <wild_bill>    right
    <wild_bill>    this is like extra-credit for datastores :-)
    <brian_skinner_>    right, but why not set it up so that the filtering just happens automatically, for free
    <jared_j>    Same as caching. You don't have to cache, but you can.
    <jared_j>    But i tried to help the DSes by putting some caching in result. :)
    <wild_bill>    and at some point we will have libraries that help the datastore author do filtering.
    -->|    peller (n=peller@216-15-119-69.c3-0.nwt-ubr2.sbo-nwt.ma.cable.rcn.com) has joined #dojo-meeting
    <brian_skinner_>    so if the author of the CSV datastore doesn't write any code to deal with filtering, then it falls back to the generic filtering thing, and combobox still works
    <wild_bill>    sure, default client side filtering
    <jared_j>    Filtering requires knowledge about the data being returned. At some point the datastore has to do something (or the widget getting the data)
    <brian_skinner_>    i think filtering does not require knowledge about the data being returned
    <brian_skinner_>    example:
    <brian_skinner_>    say you're reading in a simple CSV file
    <brian_skinner_>    with info about different states: Alaska, Arizona, etc
    <brian_skinner_>    and it has columns for population, land area, etc.
    <brian_skinner_>    the CSV store shouldn't need to implement any query parsing code, or any filtering code
    <brian_skinner_>    and you could still support queries that look like this:
    <brian_skinner_>    store.find({query:{postFilter:{state:'ca%'}}});
    <brian_skinner_>    or
    <brian_skinner_>    store.find({query:{filter:{state:'ca%'}}});
    <brian_skinner_>    or
    <brian_skinner_>    store.find({filter:{state:'ca%'}});
    <wild_bill>    sure
    <brian_skinner_>    so, what i'm proposing is that we keep the query separate from the filter
    <jared_j>    Where you're saying filter is a keyword param that says the attr/value pair in me requires filtering.
    <jared_j>    Or somesuch.
    <brian_skinner_>    jared_j: yup
    <jared_j>    ?
    <jared_j>    Ok
    <jared_j>    Hm.
    <brian_skinner_>    or you could say it like this:
    <jared_j>    So you could have both a query: and a filter:
    <brian_skinner_>    store.find({filter:{attribute:'state', value:'ca%'}});
    <wild_bill>    huh?
    <jared_j>    Wouldn't it be easier to do it as attr:'value'?
    <brian_skinner_>    store.find({query:"some/file/path/states.csv", filter:{state:'ca%'}});
    <wild_bill>    the pathname is an argument to the store constructor, isn't it?
    <jared_j>    I think he's trying to imply:
    <brian_skinner_>    wild_bill: yes, right now it is -- but it could be in the query instead
    <jared_j>    store.find({filter:{state:'ca%'}}); /// Query for everything, then filter on attribute state
    <jared_j>    Where the store could do a fetchall, then filter
    <wild_bill>    guys, come on, you are mixing api and implementation details.
    <jared_j>    Or if smart, takes the filter data and wires it into the remote query if possible to let the serverside filter?
    <brian_skinner_>    wild_bill: i think the choice of API ends up impacting what implementation options are available
    <ccmitchell>    i think this is too complex
    <ccmitchell>    keep it to basic named params, and let the ds impl deal with this
    <wild_bill>    right
    <ccmitchell>    if the impl ends up using some filtering libs and cache under the covers, that's fine
    <brian_skinner_>    but then you force the datastore author to implement filtering
    <ccmitchell>    or if it needs to chain together compound query results and do filtering, it can...under the covers.
    <wild_bill>    no, they can use a library.
    <brian_skinner_>    wild_bill: yes, but you force them to call the library
    <brian_skinner_>    they have to write code
    <wild_bill>    that's one line of code
    <ccmitchell>    focus should be on the application developer here in terms of the find api, not the datastore developer
    <wild_bill>    exactly
    <jared_j>    They have to write code anyway if the backend service itself supports filtering.
    <brian_skinner_>    jared_j: yes, but that would be a special case
    <jared_j>    Because they'd have to intercept the filter:, not let the client do it, and toss it to the server
    <jared_j>    Would it?
    <wild_bill>    brian_skinner_: it's not a special case, it's the common case.
    <jared_j>    Any Database query generally supports wildcarding
    <brian_skinner_>    i don't think we should assume that most datastore implementations will be connecting to databases
    <wild_bill>    but most servers support query, like yahoo
    <brian_skinner_>    examples: CsvStore, OpmlStore, HtmlTableStore, XmlDataIslandStore
    <brian_skinner_>    many datastores won't even talk to servers
    <brian_skinner_>    they'll just read from files
    <brian_skinner_>    or from data in pages
    <wild_bill>    it doesn't matter. having an easy api for users is paramount.
    <wild_bill>    implementation concerns like calling one line of code is secondary
    <jared_j>    Same idea in the difference of a 'user of a widget' and an 'implementer of a widget'
    <jared_j>    Users don't have to know how dojo works.
    <jared_j>    implementers do.
    <brian_skinner_>    okay
    <ccmitchell>    so can we summarize what we agree to agree on?
    <wild_bill>    ok, great, then i think we are in agreement about supporting an api like in Option #1 (but modified for delicious to list the type of return object also)
    <brian_skinner_>    right
    <ccmitchell>    y
    <wild_bill>    we didn't talk about "ca%" vs. "ca.*" but i think one of those is good
    <brian_skinner_>    do we want to say anything about what exactly the pattern match looks like
    <brian_skinner_>    jinx
    <wild_bill>    heh
    <jared_j>    So, what sort of options in filtering do we support?
    <jared_j>    I personally like * over %, but that's just me.
    <wild_bill>    .* ?
    <jared_j>    % gives me SQL shivers. :)
    <wild_bill>    or just * like on unix?
    <wild_bill>    rm *
    <ccmitchell>    * maps better to client-side filtering, % maps better to server side RDB query
    <jared_j>    I like ca*.
    <brian_skinner_>    are we just talking about strings, or are we also talking about numbers and dates?
    <ccmitchell>    since this is for client, lets go with *
    <wild_bill>    fine w/me
    <ccmitchell>    how to escape it?
    <jared_j>    string.replace("*","%") when it has to go to a DB query. Easy swap.
    <jared_j>    Or whatever the call is to replace.
    <ccmitchell>    eg how do i find tags with "*" in the name
    <jared_j>    Escape the * so it doesn't get treated as a wildcard?
    <ccmitchell>    vs. all tags
    <jared_j>   
    <jared_j>    \*
    <jared_j>    Common escape pattern
    <wild_bill>    sure
    <jared_j>    At least in JSON, etc.
    <ccmitchell>    or use unicode sequence? :)
    <wild_bill>    heh
    <wild_bill>    ok, sounds like a conclusion
    <jared_j>    Do we want to support ? too? Single char matching?
    <jared_j>    Or just do * for now.
    <wild_bill>    just do * for now
    <ccmitchell>    and star allowed pre and post string value
    <jared_j>    Ok. I usually * match when doing UNIX stuff anyway. Don't use ? that often.
    <jared_j>    Not in the middle?
    <ccmitchell>    or in middle
    <wild_bill>    but especially ca* and *ca* need to work
    <jared_j>    *foo, foo*, fo*o And so on.
    <ccmitchell>    *abc, abc*, a*c, what else?
    <jared_j>    *abc*
    <brian_skinner_>    so, like this: store.find({query:{name:"ca*"}}); rather than: store.find({query:{name:"ca.*"}});
    <ccmitchell>    that seems sufficient for typeahead case
    <brian_skinner_>    ?
    <brian_skinner_>    * instead of .*
    <ccmitchell>    brian: y
    <wild_bill>    brian_skinner_: yup
    <brian_skinner_>    okay
    <jared_j>    Yeah, think of it like matching a filename on disk
    <jared_j>    ls *foo.txt Or ls foo*
    <jared_j>    Or whatever.
    <wild_bill>    ok, so what we need now is a reference implementation of this for JsonItemStore
    <ccmitchell>    in the case where a ds cant support the * wildcarding, it can do the match against the string, no error (but no matches)
    <jared_j>    Give me a few days.
    <wild_bill>    cool, thanks for volunteering
    <brian_skinner_>    yup, and also need to open a new trac ticket about adding this feature to the other datastores
    <jared_j>    I'm already in JsonItemStore playing with fetch anyway
    <wild_bill>    yup
    <jared_j>    I should have something by next meeting time I imagine
    <jared_j>    Hopefully sooner.
    <brian_skinner_>    cool, thanks jared_j
    <wild_bill>    remember that performance can come later; the urgent thing is having a reference impl so that widget development can go on.
    <wild_bill>    (and also, the other data stores can come later)
    <jared_j>    Right
    <jared_j>    I'm only working on JsonItemStore right now anyway
    <jared_j>    Even with fetch
    <jared_j>    Since it's easy to rough out ideas on
    <wild_bill>    yup
    <brian_skinner_>    so i think we addressed all three agenda items, yes?
    <jared_j>    Yes, I believe so.
    <brian_skinner_>    cool
    <wild_bill>    excellent
    <wild_bill>    great meeting everyone.
    <jared_j>    Progress! Always good.
    <brian_skinner_>    other stuff to talk about, or should we let wild_bill get some sleep?
    <wild_bill>    heh
    <jared_j>    One thing. Should we consider a SoC project involving dojo.data?
    <brian_skinner_>    thanks for making it to these meetings wild_bill -- much appreciated
    <wild_bill>    np
    <brian_skinner_>    jared_j: i proposed a few of them on the wiki page
    <jared_j>    Can you mail me or post me the link?
    <brian_skinner_>    http://dojo.jot.com/SummerOfCode
    <jared_j>    Ah, ok
    <brian_skinner_>    offline dojo.data project
    <brian_skinner_>    dojo.data.Overdrive project
    <brian_skinner_>    dojo.data.MediaWiki project
    <jared_j>    Ah, some DS impls.
    <jared_j>    Nifty
    <wild_bill>    nice
    <brian_skinner_>    i can be a backup mentor, but i can't be a primary mentor this year
    <ccmitchell>    how about your time-series data thoughts?
    <brian_skinner_>    ccmitchell: ouch
    <ccmitchell>    for SoC i mean
    <brian_skinner_>    ccmitchell: to discuss now, or as a SoC project?
    <jared_j>    SoC
    <brian_skinner_>    ccmitchell: i don't have a solution to propose, so i'm not sure what i'd suggest as a project for the SoC student
    <jared_j>    Well, I think this is a good point to close the meeting on, then
    <brian_skinner_>    but it would definitely be great if dojo.data could return time-series data, and we could bind that to a line graph
    <ccmitchell>    are there any special considerations for querying time-series data? other than supporting from: and to: in a query to a datasource (like that return items must have a time field)?
    <ccmitchell>    is time-series data just data with a timestamp on each item?
    <ccmitchell>    eg log entries
    <brian_skinner_>    ccmitchell: maybe, but i'm not sure that time-series data is just data with a timestamp on each item
    <brian_skinner_>    you could have an item representing Kansas, with attributes like "population" and "land area"
    <brian_skinner_>    where land area has a constant value
    <brian_skinner_>    but population has different values associated with different times
    <ccmitchell>    gotcha, you want to assign different param values to your dimensions for visualization
    <ccmitchell>    err property values
    <brian_skinner_>    jared_j: we're using JsonItemStore in OpenRecord -- we started running into bugs, and i've been creating trac tickets for them
    <brian_skinner_>    i can fix them at some point, but i'm swamped right now with personal life stuff
    <jared_j>    Send me the #s. While I'm in there, if they're in an area of the code I have to touch anyway, I'll see if I can tweak it.
    <brian_skinner_>    feel free to fix any of them if you want to, if you're working in that code anyway
    <brian_skinner_>    #2541, #2542, #2546, #2547, #2553
    <jared_j>    Ok. I'll skim them before I get going on the filter stuff.
    <jared_j>    And see if any are quick fixes in the same sections
    <brian_skinner_>    jared_j: cool, thanks!
    <brian_skinner_>    ccmitchell: it would be cool if the dojo.data time series stuff could someday support the sorts of rich visualization stuff that Hans Rosling was showing off in his TED talk:
    <brian_skinner_>    http://www.ted.com/tedtalks/tedtalksplayer.cfm?key=hans_rosling
    <jared_j>    Well, I'm going to head out and find dinner.
    <ccmitchell>    yeah
    <brian_skinner_>    me too -- see ya later
    <ccmitchell>    g'night!
    |<--    brian_skinner_ has left irc.freenode.net ("Chatzilla 0.9.77 [Firefox 1.5.0.10/2007021601]")
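
The wildcard convention agreed on in the log above (a * matching any run of characters at the start, middle, or end of a value, \* for a literal asterisk, and no ? for now) can be sketched as a small client-side matcher. This is only an illustration under stated assumptions, not the JsonItemStore reference implementation discussed in the meeting: the helper names wildcardToRegExp, filterItems, and wildcardToSqlLike are hypothetical, items are assumed to be plain objects, the ignoreCase flag is an addition for the typeahead case, and the LIKE translation assumes a backend that uses backslash as its LIKE escape character.

```javascript
// Hypothetical helper: escape one character so it is literal inside a RegExp.
function escapeForRegExp(ch) {
  return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: turn a "*" wildcard pattern into an anchored RegExp.
// "\*" in the pattern means a literal asterisk, as proposed in the meeting.
function wildcardToRegExp(pattern, ignoreCase) {
  let source = "";
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i);
    if (ch === "\\" && i + 1 < pattern.length) {
      // Escaped character: match the next character literally.
      i++;
      source += escapeForRegExp(pattern.charAt(i));
    } else if (ch === "*") {
      source += ".*"; // wildcard: any run of characters
    } else {
      source += escapeForRegExp(ch);
    }
  }
  return new RegExp("^" + source + "$", ignoreCase ? "i" : "");
}

// Hypothetical default client-side filter: keep the items whose attributes
// match every pattern in the query. Items are assumed to be plain objects.
function filterItems(items, query, ignoreCase) {
  const matchers = Object.keys(query).map(function (attr) {
    return { attr: attr, regexp: wildcardToRegExp(String(query[attr]), ignoreCase) };
  });
  return items.filter(function (item) {
    return matchers.every(function (m) {
      return item[m.attr] !== undefined && m.regexp.test(String(item[m.attr]));
    });
  });
}

// Hypothetical translation for a store that forwards the pattern to a SQL
// backend. Assumes backslash is the LIKE escape character, which varies by
// database.
function wildcardToSqlLike(pattern) {
  let out = "";
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i);
    if (ch === "\\" && i + 1 < pattern.length) {
      // Escaped character: keep it literal, re-escaping LIKE metacharacters.
      i++;
      const next = pattern.charAt(i);
      out += (next === "%" || next === "_" || next === "\\") ? "\\" + next : next;
    } else if (ch === "*") {
      out += "%"; // wildcard becomes the LIKE wildcard
    } else if (ch === "%" || ch === "_" || ch === "\\") {
      out += "\\" + ch; // literal characters that LIKE would treat specially
    } else {
      out += ch;
    }
  }
  return out;
}
```

With the states example from the log, filterItems(states, {state: "ca*"}, true) keeps only California, while the pattern "*ca*" also keeps North Carolina. A plain string.replace("*", "%") in JavaScript replaces only the first asterisk, which is why the LIKE translation walks the pattern character by character.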

