whatspop


A log of interesting things made by Kunal Anand

ColdFusion Tagging Library

12/05/2005

If you're building a Web 2.0 application, odds are that you allow "tagging." Here is my extremely low-fidelity approach that I whipped up in a few minutes using ColdFusion.

In setting up a tagging engine, we need to clarify the mechanics of the system. The following set of rules is loosely based on popular tagging implementations (including del.icio.us and Flickr):

  1. Tags are delimited by whitespace
  2. HTML tags are escaped
  3. Tags are all lower-cased
  4. Duplicate tags in the list are removed
  5. Tags are sorted alphabetically

Let's start by creating a list of tags:

<cfset tags_list = "   CaMELCaSE   comma, 'singlequoted' <b>bold</b> ""doublequoted"" double double  this should be over the limit now " />

You should see some good-looking strings, some HTML, and a duplicate value. Now it's our job to sanitize all of that into an acceptable format for data storage and browser display. Don't worry, it's not too much extra work.

We begin our code by defining a few constraints. Even though these are completely optional, I think it is good to prevent the user from entering lots of very long tags. The amount variable controls the maximum number of tags, and the length variable controls the maximum number of characters in a tag.

<cfset amount = 10 />
<cfset length = 20 />

Remember that rather strange tag list from above? It is time to normalize it using some of the loosely defined engine rules listed at the start of this post.

<cfset tags_list = trim(tags_list) />
<cfset tags_list = lcase(tags_list) />

The first line trims all the leading and trailing spaces. The second line lower-cases all the tags. Now we have to get rid of those duplicate tags (remember "double double" from the list above?):

<cfset tags_struct = structnew() />

<cfloop index="tag" list="#tags_list#" delimiters=" ">
    <cfset tags_struct[tag] = "" />
</cfloop>

<cfset tags_list = structkeylist(tags_struct," ") />
<cfset tags_list = listsort(tags_list, "textnocase", "asc", " ") />

For those who are interested, I decided to use structure keys for automatic de-duplication: assigning the same key twice simply overwrites it, so each tag ends up in the structure once. Another possible way to de-duplicate the list is to instantiate another list variable, for temporary storage, and loop over the tags while checking that list with the "listcontains" function (or, more safely, "listfind", since "listcontains" also matches substrings). For my purposes, the method above is simple and involves the creation of a flat data structure. After the conversion back to a list, we sort the tags alphabetically. Now we move on to the final step:
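As an aside, the list-based alternative mentioned above could be sketched roughly like this. It is only an illustration, not what the rest of this post uses, and "unique_list" is a made-up variable name. It uses listfind rather than listcontains because listcontains matches substrings: with the sample list, "double" would be treated as already present as soon as the "doublequoted" tag had been stored.

```cfml
<!--- Alternative de-duplication: copy tags into a second list, skipping repeats --->
<!--- "unique_list" is a hypothetical name used only for this sketch --->
<cfset unique_list = "" />

<cfloop index="tag" list="#tags_list#" delimiters=" ">
    <!--- listfind compares whole elements; tags are already lower-cased at this point --->
    <cfif listfind(unique_list, tag, " ") eq 0>
        <cfset unique_list = listappend(unique_list, tag, " ") />
    </cfif>
</cfloop>

<cfset tags_list = listsort(unique_list, "textnocase", "asc", " ") />
```

Whichever way the duplicates are removed, the final step is the same: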

<cfif tags_list neq "">
    <cfif listlen(tags_list," ") lte amount>
        <cfset amount = listlen(tags_list," ") />
    </cfif>
    <cfloop from="1" to="#amount#" index="tag">
        <!--- Perform a database insert/update here --->
        <cfoutput>#htmleditformat(left(listgetat(tags_list,tag," "),length))#</cfoutput>
    </cfloop>
<cfelse>
    Sorry, you need to provide at least one tag
</cfif>

If the list is empty, report an error. Otherwise, do an insert/update for each tag (cfquery), using the listgetat expression shown above to get at the current tag in the loop. You can also see how the "amount" and "length" variables cap the number of tags processed and the length of each one. Note that I prefer to take care of my HTML munging here, in the final step; you could instead deal with it when you are normalizing the tags. Finally, this is just a quick proof of concept - I have probably overlooked some security quirks and general ColdFusion issues, such as the pound sign.
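The insert itself is left as a placeholder above, so here is a rough sketch of what it might look like. Everything about the database is an assumption: the datasource "my_dsn", the table "tags", its columns "item_id" and "tag", and the "item_id" variable are all placeholders. The one part worth keeping is cfqueryparam, which binds the tag as a parameter instead of pasting user input into the SQL string.

```cfml
<!--- Hypothetical: the id of whatever is being tagged --->
<cfset item_id = 1 />

<cfloop from="1" to="#amount#" index="i">
    <!--- Current tag, cut down to the maximum length --->
    <cfset current_tag = left(listgetat(tags_list, i, " "), length) />

    <!--- "my_dsn", "tags", "item_id" and "tag" are placeholder names --->
    <cfquery datasource="my_dsn">
        INSERT INTO tags (item_id, tag)
        VALUES (
            <cfqueryparam value="#item_id#" cfsqltype="cf_sql_integer" />,
            <cfqueryparam value="#current_tag#" cfsqltype="cf_sql_varchar" maxlength="#length#" />
        )
    </cfquery>
</cfloop>
```

This version stores the tag as typed and leaves htmleditformat for display time. That is one option among several; if you would rather store the escaped form, wrap current_tag in htmleditformat before the query and size the column accordingly.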

Go play with the code! Incorporate it into some of your projects. If you want to help or manage to extend this, shoot me an email.
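
For anyone who wants a starting point, the steps from this post could be wrapped in a function along these lines. This is a sketch rather than a tested library: "normalizeTags" is a made-up name and the defaults simply mirror the values used above. It also adds one step the post does not have, turning tabs and newlines into spaces so that rule 1 ("delimited by whitespace") holds for more than plain spaces.

```cfml
<!--- "normalizeTags" is a hypothetical helper name; defaults mirror the post --->
<cffunction name="normalizeTags" returntype="string" output="false">
    <cfargument name="tags_list" type="string" required="true" />
    <cfargument name="amount" type="numeric" default="10" />
    <cfargument name="length" type="numeric" default="20" />

    <cfset var tags_struct = structnew() />
    <cfset var sorted = "" />
    <cfset var result = "" />
    <cfset var tag = "" />
    <cfset var total = 0 />
    <cfset var i = 0 />

    <!--- Added step: collapse any run of whitespace into one space, then trim and lower-case --->
    <cfset sorted = rereplace(arguments.tags_list, "[[:space:]]+", " ", "all") />
    <cfset sorted = lcase(trim(sorted)) />

    <!--- De-duplicate through struct keys, then sort alphabetically --->
    <cfloop index="tag" list="#sorted#" delimiters=" ">
        <cfset tags_struct[tag] = "" />
    </cfloop>
    <cfset sorted = listsort(structkeylist(tags_struct, " "), "textnocase", "asc", " ") />

    <!--- Keep at most "amount" tags, each cut to "length" characters --->
    <cfset total = min(arguments.amount, listlen(sorted, " ")) />
    <cfloop from="1" to="#total#" index="i">
        <cfset result = listappend(result, left(listgetat(sorted, i, " "), arguments.length), " ") />
    </cfloop>

    <cfreturn result />
</cffunction>
```

Calling it would look something like `<cfoutput>#htmleditformat(normalizeTags(tags_list))#</cfoutput>`. One caveat: because tags are truncated after de-duplication, two long tags that share their first 20 characters would come out as duplicates.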