wiki-thoughts: February 2003

Thoughts on Structured/Unstructured and Shared Information Management using Wiki and other emerging technologies

Wednesday, February 19, 2003

I was browsing through Nilesh's blog (techie, works at the company that is rolling out the biggest CDMA launch :-). He has a nicely written article on "Swarming" - a methodology for understanding complex systems and coming up with solutions that work. Here is the URL of the blog entry (and he gives nice examples):
Swarming, KM and Reliance

And his main question is: How does a big company, with very complex patterns of working, still produce great results? He seems to find answers at http://www.lse.ac.uk/LSE/COMPLEX/Seminars/2001/report_19march01.htm (seminar notes on 'Clustering and Swarming as self-organising techniques in virtual communities' by IBM's David Snowden).

Some interesting notes on how Reliance actually practices some of these things.

permalink 5:12 PM 0 Comments

Tuesday, February 18, 2003

Wiki (and twiki in particular) are very good for "Information Organization". I am trying to get a better term; but let us go on. Please note: I am quoting the use case from a typical organization setup.

You are a service within an organization. You maintain information on an internal website, and you create some new pages related to some new activity. Then you use email to send out the URL of the start page for this activity. And people are happy; they can access the pages.

Well, what is forgotten here? If I have to come back after a few days, there is no way to remember the URL. I have to find the specific mail (which is more and more difficult these days, with so much information overload). More importantly, the information is limited to the specific users who received the mail in the first place.

What is the right way to do this? You want an area on the home page of the site where you put new 'Start page' links for any new activity taken up. In twiki, this is very easy to do: Just refer to the new page (though it has not been created yet) from the home page, and click on the '?' accompanying this link to create the actual start page for the activity.
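The '?' mechanism can be illustrated with a minimal sketch. It is not TWiki's actual rendering code: the WikiWord pattern is simplified, and the /view/ and /edit/ paths and the function name render_links are assumptions made up for this example.

```python
import re

# Simplified WikiWord: two or more capitalized runs joined together,
# e.g. "OfficeMove". Real wiki engines use more elaborate rules.
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z0-9]+){2,}\b")

def render_links(text, existing_topics):
    """Link known topics; mark unknown ones with a '?' that creates them."""
    def replace(match):
        name = match.group(0)
        if name in existing_topics:
            # Hypothetical view URL for a topic that already exists.
            return f'<a href="/view/{name}">{name}</a>'
        # Hypothetical edit URL: following the '?' creates the topic.
        return f'{name}<a href="/edit/{name}">?</a>'
    return WIKI_WORD.sub(replace, text)

home_page = "New activities: ProjectApollo, OfficeMove"
print(render_links(home_page, existing_topics={"ProjectApollo"}))
```

The point of the design is that the home page link exists before the start page does, so the link itself becomes the place where the page gets created.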

Now what is the right term for this? Information management? - too general. Information Access Management? - too managerial. OK, forget it.

BTW, Portals also allow you to do the same. But initiating a portal in an organization setup is usually a very big affair, costing a lot of money and time. In such a case, adding a link becomes a business process. Why can't we do some things by common sense?

And geeks: Don't assume this point is too silly to list here. We are one of the top organizations, and I find it very difficult to get this conveyed to a lot of people. And of course, they are good in their own field.

permalink 5:11 PM 0 Comments

A good article on Backlinking: http://lightningfield.com/david/clips/0211backlinks.html. I got this from: http://www.disenchanted.com/dis/linkback.html.

Backlinking is perhaps the next revolution on the web, after forward linking to different sites from a web page. Suppose I write an article, and some people read it and refer to it from their weblogs or other sites. I would like to know who has linked to it (so I am able to read their views). Now, either the web-based applications should co-operate (and this is easy when the same application runs on all the sites), or a third party should visit both sites and somehow connect them. The latter is what is already possible on the web: The readers of the other sites follow the link and visit my site to see the complete article. In this process, the visitor's browser passes on sufficient information (the address of the page it came from). This information is then made available in real time from my site.
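A minimal sketch of the idea, assuming the information passed by the browser is the HTTP Referer header and that the site records it on every page view. The class name BacklinkLog and the host names are made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit

class BacklinkLog:
    """Collects referring URLs per page, as reported by visitors' browsers."""

    def __init__(self, own_host):
        self.own_host = own_host
        # page path -> {referring URL: number of visits that came from it}
        self.backlinks = defaultdict(dict)

    def record(self, path, referer):
        """Call once per page view with the Referer header (may be None)."""
        if not referer:
            return
        host = urlsplit(referer).hostname
        if host is None or host == self.own_host:
            return  # ignore malformed referers and links within my own site
        hits = self.backlinks[path]
        hits[referer] = hits.get(referer, 0) + 1

    def for_page(self, path):
        """Referring URLs for a page, most frequent first."""
        hits = self.backlinks.get(path, {})
        return sorted(hits.items(), key=lambda item: (-item[1], item[0]))

log = BacklinkLog("example.org")
log.record("/articles/backlinks", "http://blog.example.net/post1")
print(log.for_page("/articles/backlinks"))
```

Note that the Referer header is optional and easy to fake, so a list built this way is a hint about who links to the article, not proof.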

Isn't that a revolution?
permalink 3:06 PM 0 Comments

Weblogs and TWiki Integration: Some links

These came off the topics I read in twiki.org.

http://www.lathi.net/twiki-bin/view/Main/BlosxomAndTWiki : Matt Wilkie writes about how he made blosxom render wiki-formatted information. (This weblog is in perl, and uses standard file formats - that is what most of us like!)

And, http://www.decafbad.com/news_archives/000280.phtml#000280 "TWiki + Automatic MT-Search = WeblogWithWiki" notes.

http://www.decafbad.com/news_archives/000244.phtml#000244: Movable Type plugin for Wiki Formatting and XML-RPC Filtering Pipelines. Came to know that there is a Text::WikiFormat perl module.

http://www.decafbad.com/twiki/bin/view/Main/WeblogWithWiki: Article on Weblog with Wiki concepts.

permalink 2:42 PM 0 Comments

I need to introduce IMV at some or the other point in time. (http://imv.sourceforge.net/).

IMV stands for Information Meta View. It is many things in one: A personally owned and managed data source, a data model in the form of graph (as opposed to tree in LDAP and relational tables in databases), a structured data aggregation framework among diversely distributed IMV data sources, and so on.

What is the main purpose? To have a framework that allows us to manage and share information in dynamically formed P2P networks. For example, if everyone shares information about the books they own (or have read) etc., it can result in a massive distributed database. The key point to note is that every individual maintains the information she owns, and subscribes to (and shares) the information she likes from among friends and contacts. We standardize the process of sharing, the data model, and the capability to evolve metadata so that sharing actually happens.
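The books example can be sketched roughly as follows. This is not IMV's actual data model or API: the DataSource class, the triple representation of graph edges, and the relation names are assumptions chosen only to show a personally owned graph being aggregated across contacts.

```python
class DataSource:
    """A personally owned store holding a small graph as labelled edges."""

    def __init__(self, owner):
        self.owner = owner
        self.edges = set()    # (subject, relation, object) triples
        self.contacts = []    # other sources this owner subscribes to

    def add(self, subject, relation, obj):
        self.edges.add((subject, relation, obj))

    def subscribe(self, other):
        self.contacts.append(other)

    def query(self, relation):
        """Aggregate matching edges from my own store and my contacts' stores."""
        rows = []
        for source in [self] + self.contacts:
            for subject, rel, obj in source.edges:
                if rel == relation:
                    rows.append((source.owner, subject, obj))
        return sorted(rows)

asha = DataSource("asha")
ravi = DataSource("ravi")
asha.add("asha", "owns", "Dune")
asha.add("Dune", "written_by", "Frank Herbert")  # edges need not form a tree
ravi.add("ravi", "owns", "Emma")
ravi.subscribe(asha)
print(ravi.query("owns"))
```

Each person edits only their own store; the "massive distributed database" appears only as the result of queries across subscriptions.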

This project started off some 3 years ago. It has survived till date mainly through projects undertaken by engineering students. One version was also sponsored by the company I work for.

Over time, we have learnt a lot of things. Firstly, Semantic Net does this and more. Nevertheless, we don't actually have a system in place that allows us to do the things that we have envisaged. So the project is still valid.

Second, for the product to succeed in real world, it should follow rules of real world. It is opensource, and so, geeks will be the first to adopt it. However, we want to design in such a way that other communities - such as businesses, ISPs etc. find value in the model, and it should be easy for them to boot. Third, it is eventually intended for end users who are not computer savvy. They understand email well. And we want to extend the email model so that email can now be used with structured data.

Rishi Desai (http://rushi.desai.name/) and Pavan Reddy (at Persistent) created a WebDAV compliant IMV server. This server is now available and used by other projects. (This was the first successful demonstration, and won a couple of prizes too.)

Priyanka Grover (who was at Persistent earlier, and my wife) created the first server with IMAP as backend, and a Template based system as front end - with IMV metadata and tree exposed to people. Obviously, we had to learn lessons in UI: The tree view is not the right way to present the information to users.

A group from AIT, Pune (2001-2002 batch): Vinayak Sharma (vinayak_ait@hotmail.com), Amrit Pal Singh (amrit_sarao@hotmail.com), Indervir Singh (aitproject@hotmail.com), Navneet Singh Waraich (nswaraich@hotmail.com) - contributed the IMV Browser component in Mozilla 0.9.5 using XUL. The idea here was to be able to create dynamic tables from IMV data. IMV is an arbitrary graph. However, the views need to be table centric; they can then be published on the web. Vivek Shende (mailto:svivek@persistent.co.in) co-ordinated this effort.

In the current year, we have two main projects:

A data store for IMV and any structured data, with some interesting characteristics. The group from PICT (anurag_chakravarti2002@yahoo.com and others) is working on this. The key point here is that we want applications to use "Structured Data Models" to interact with secondary storage. Further, we want to make it easy for end users to manage this type of storage. An example: The email client that uses this data model for the email lists will not see a file on disk. It will instead see a list that has no bottom. You put in a new CD that contains a part of your emails (say from a specific month during last year), and the application will instantly see the list grow - albeit with gaps that correspond to other CDs. The increasing volume of information will require a new model, and this is what we are trying to come up with.
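One way to picture the "list with gaps" is the sketch below. It is only an illustration, not the PICT group's design: it assumes a small catalog that remembers which index range lives on which volume, and the class name VolumeBackedList and the volume names are invented.

```python
class VolumeBackedList:
    """One logical list assembled from whichever volumes are mounted."""

    def __init__(self, catalog):
        # catalog: volume name -> (first_index, last_index), inclusive.
        # Assumed to be known even when the volume itself is not inserted.
        self.catalog = catalog
        self.mounted = {}

    def mount(self, name, records):
        first, last = self.catalog[name]
        if len(records) != last - first + 1:
            raise ValueError(f"volume {name} does not match the catalog")
        self.mounted[name] = list(records)

    def view(self):
        """The list as the application sees it, with placeholders for gaps."""
        rows = []
        by_position = sorted(self.catalog.items(), key=lambda item: item[1][0])
        for name, (first, last) in by_position:
            if name in self.mounted:
                rows.extend(self.mounted[name])
            else:
                rows.append(f"<gap: {last - first + 1} items on volume {name}>")
        return rows

mails = VolumeBackedList({"cd-jan": (0, 1), "cd-feb": (2, 4)})
mails.mount("cd-feb", ["mail 3", "mail 4", "mail 5"])
print(mails.view())
```

The application never opens a file; it asks for the view, and the view grows as volumes are inserted.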

The second project concerns modelling of IMV interfaces as seen by different stakeholders. Firstly the end users: They should just see an extension of the email paradigm. They can compose an imvMail that asks "Does anyone have this book?" i.e. it requests the recipients to submit structured data to their own IMV data stores, and then allows that information to be aggregated (from the replies sent back) into a table. Then there are 'application providers' who provide applications such as Books. (An application here is a metadata set with some basic workflow.) And then there are the IMV server hosting people - like web server hosting providers - who make it possible for people to get an IMV account, much like they get an email account. Note that we don't want a centralized system such as yahoo or hotmail: The system has to be as distributed as possible. Two people from MIT, Pune are working on this (Amber Saxena - ambarseksena@hotmail.com is the contact point).

We are toying with different transports. First it was WebDAV protocol. But it is not popular. So now, we feel that we just use email transport, with specially formed MIME messages that carry IMV messages. And eventually we want to pave the way for email clients to enhance themselves to manage IMV data as well.
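To make the email transport idea concrete, here is a minimal sketch using Python's standard email package. It does not describe the real IMV message format: the media type application/x-imv+json, the use of JSON as payload, and the function names are all assumptions for illustration.

```python
import json
from email import message_from_bytes, policy
from email.message import EmailMessage

# Hypothetical media type for the structured part of a message.
IMV_MAINTYPE, IMV_SUBTYPE = "application", "x-imv+json"

def build_reply(sender, recipient, record):
    """An ordinary mail with one extra MIME part carrying structured data."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Re: Does anyone have this book?"
    msg.set_content("Structured reply attached.")  # readable in any client
    msg.add_attachment(json.dumps(record).encode("utf-8"),
                       maintype=IMV_MAINTYPE, subtype=IMV_SUBTYPE,
                       filename="reply.json")
    return msg

def aggregate(raw_messages):
    """Collect the structured parts of all replies into table rows."""
    rows = []
    for raw in raw_messages:
        msg = message_from_bytes(raw, policy=policy.default)
        for part in msg.iter_attachments():
            if part.get_content_type() == f"{IMV_MAINTYPE}/{IMV_SUBTYPE}":
                record = json.loads(part.get_content().decode("utf-8"))
                rows.append({"from": str(msg["From"]), **record})
    return rows

replies = [
    build_reply("asha@example.org", "me@example.org",
                {"title": "Dune", "have": True}).as_bytes(),
    build_reply("ravi@example.org", "me@example.org",
                {"title": "Dune", "have": False}).as_bytes(),
]
for row in aggregate(replies):
    print(row)
```

A client that knows nothing about the extra part still shows a normal mail, which is what makes email attractive as a transport here.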

We are also working on IM based interfaces to IMV. This will be more natural language based interaction with IMV data store.

More on this later.

permalink 12:18 PM 0 Comments

http://www.mindjack.com/feature/spin.html: "Spinning the web: The realities of Online Reputation Management" by Nicholas Carroll.

Summary: Human beings take various cues in determining the reputation of the person or organization they interact with. Direct interaction (face to face, or by phone) is a very successful model, and involves taking cues from gestures, facial expressions etc. How did it change when email, usenet, web (and now weblogs, wiki) happened? Since people depend on the written word alone, reputation management becomes trickier. It can make or break a web venture.

The article talks about how the web affected the reputation of industries: For the travel industry, it was a downside and led to realistic price levels. On the other hand, home retailing companies such as L.L.Bean could manage and enhance their reputation.

Two news articles related to facial expressions today:

"Secret revealed of Mona Lisa smile":
http://news.independent.co.uk/world/science_medical/story.jsp?story=379381.

Also: http://timesofindia.indiatimes.com/cms.dll/html/uncomp/articleshow?artid=37803371: "Scientists Unmask Face-Reading Secrets." This one even has a reference to how some Buddhist monks can accurately understand facial muscle contractions that last only 1/25th of a second.

permalink 10:38 AM 0 Comments

Friday, February 14, 2003

Found a good weblog (hosted in radio): Internet Technology Watch. http://radio.weblogs.com/0100746/. Refers to wiki also: Why not have XML schema to talk to wiki servers?
permalink 1:36 PM 0 Comments

Theme: Information Management in various environments - viz Email, Wiki, Desktop, Web etc.

We will talk about: Information management in different environments. Primarily: Email, desktop, web, wiki. And can users get to keep their environment, and yet be able to achieve all objectives of usual information management requirements?

Most 'active' people in industry move around with a laptop. And they love and live in email.

Why? The primary reason: They always have access to most important information - mails, contacts, important documents and so on. The laptop platform is hard to replicate on other platforms such as PocketPCs or Palms.

And Emails play an important role here. Let us look at this from perspective of Information management. What is good about email? (Compared to other information management platforms such as web ...)
- You are triggered when new information arrives.
- You have very good search.
- Information is browsable based on time and folders
- You can 'relate' to any piece of information: It remains same since you last saw it. Nothing changes "under the hood".
- Information replicates in the Inbox. This means, your 'view' of information is intact even when source changes. This is, in general, a good thing.

And bad things about email (at least in existing clients):
- Browsability, unlike web, is linear.
- Relations between pieces of information - in the form of links (as in web pages) - can't be introduced, nor can the system automatically create them. At most, you have threads. (You can also sort by sender, which is also a relation.) But the whole system remains static.
- And the "one bucket" model is not suitable for information management. We require context sensitive views. If I am at home, I would like to see a different inbox. If I am meeting a specific client, I would like to have a different view. And all automatically. (BTW, does anyone else feel this is a requirement?)
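The context sensitive views in the last point could look something like the sketch below. No existing mail client is being described; the message fields, the context names, and the rule table are assumptions made up to show the idea of one inbox seen through different filters.

```python
# Hypothetical rules: each context maps to a test applied to every message.
CONTEXT_RULES = {
    "home": lambda m: m["folder"] == "personal",
    "client:acme": lambda m: m["sender"].endswith("@acme.example"),
}

def inbox_view(messages, context):
    """Return the part of the single inbox that matters in this context."""
    rule = CONTEXT_RULES.get(context)
    if rule is None:
        return list(messages)  # unknown context: fall back to everything
    return [m for m in messages if rule(m)]

inbox = [
    {"sender": "mom@example.org", "folder": "personal", "subject": "Dinner"},
    {"sender": "cto@acme.example", "folder": "work", "subject": "Contract"},
]
print(inbox_view(inbox, "home"))
print(inbox_view(inbox, "client:acme"))
```

The hard part, which this sketch leaves out, is detecting the context automatically instead of having the user pick it.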

AvantGo

AvantGo-type functionality: You view the same links, but information in them can keep changing.

twiki and other wiki systems

Main problem with wiki systems is that information "changes under the hood" because anyone can refactor the information. The model is good to write shared documents, but not for information management of what individuals like. However, this is easy to fix.

Second problem is: Information replication. Because of shared nature, this part is indeed difficult. Note that in email you never have to 'integrate information'. You only create new information - referring to other information as necessary (say when you reply to mails etc.)

.. continues.

permalink 1:35 PM 0 Comments

Tuesday, February 04, 2003

IWantToWriteAboutThis: Aggregating Structured data by Email. Traditional: Use Excel sheets, aggregate them into a database. How to do it 1. Nicely? 2. Without creating dependency on specific products?
permalink 2:39 PM 0 Comments

Managing Structured data in TWiki

I keep working with twiki. The ideas keep coming at fast pace.

These days I am working on making structured data manageable in the twiki (and in general, wiki) framework. The Topic names provide a uniform resource locator (URL; yes, sometimes you need to spell it out to highlight the significance!) which acts as an anchor to the specific set of data you are interested in. This data could be a simple list, a set of rows of a table (for which, say, you are administratively responsible), and so on. Due to the plug-in nature, the data could come from a Database, and yet be available in the format we choose to keep it in within the topic.

So the first part is: How do you keep structured data in twiki, and yet make it editable? TWiki chooses to use simple lists with capitalized words in cases such as Access controls, groups etc. I am trying to use YAML for the same purpose. The reasons are:

- It is XML in disguise, even though the YAML folks claim the contrary. At least you get some of the power of XML, and you don't lose anything.
- It is hand editable, if really required.
- It has good support in perl. In fact, you can parse the YAML data and directly create Perl structures.

So, I have chosen it for specifying any structured content in TWiki topic. (Typically wrapped with some markup so the plugins can identify that what follows is structured data.)
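A minimal sketch of how a plugin might find and read such a block. The %DATA_BEGIN% and %DATA_END% markers are invented for this example (they are not TWiki syntax), and the hand-written parser handles only a tiny YAML-like subset, a flat list of "key: value" records; a real plugin would hand the block to a proper YAML parser.

```python
import re

# Hypothetical markers wrapping the structured part of a topic.
BLOCK = re.compile(r"%DATA_BEGIN%\n(.*?)\n%DATA_END%", re.DOTALL)

def parse_records(block):
    """Parse a flat list of records: '- key: value' starts a new record,
    an indented 'key: value' line adds a field to the current one."""
    records = []
    for line in block.splitlines():
        if not line.strip():
            continue
        starts_record = line.startswith("- ")
        body = line[2:] if starts_record else line.strip()
        key, _, value = body.partition(":")
        if starts_record:
            records.append({})
        if not records:
            raise ValueError("field found before the first record")
        records[-1][key.strip()] = value.strip()
    return records

def extract_tables(topic_text):
    """Return one list of records per structured block in the topic."""
    return [parse_records(m.group(1)) for m in BLOCK.finditer(topic_text)]

topic = """Members of this group:
%DATA_BEGIN%
- name: Asha
  role: admin
- name: Ravi
  role: editor
%DATA_END%
Edit the list above by hand if required.
"""
print(extract_tables(topic))
```

The rest of the topic stays ordinary wiki text, so the page remains readable and hand editable even without the plugin.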

The next step is to be able to display this data nicely - in variety of ways. I now have a plugin to do exactly this. Currently it works with table data. My idea is to couple this with variety of CSS styles so user can choose how he would like to see the table.
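As a rough illustration of the display step, the sketch below assumes the structured data has already been parsed into a list of dictionaries and renders it as an HTML table whose CSS class is chosen by the user. The function name render_table and the class names are made up; this is not the actual plugin.

```python
from html import escape

def render_table(records, css_class="plain"):
    """Render a list of dictionaries as an HTML table with a CSS class."""
    if not records:
        return ""
    # Column order: keys in the order they first appear.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    lines = [f'<table class="{escape(css_class)}">']
    lines.append("<tr>" + "".join(f"<th>{escape(c)}</th>" for c in columns) + "</tr>")
    for record in records:
        cells = "".join(f"<td>{escape(str(record.get(c, '')))}</td>" for c in columns)
        lines.append("<tr>" + cells + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)

rows = [{"name": "Asha", "role": "admin"}, {"name": "Ravi", "role": "editor"}]
print(render_table(rows, css_class="zebra"))
```

Since only the class name changes between renderings, the same data can be shown in different styles by swapping the stylesheet.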

How do we add content to this table? Either hand-edit the topic, or use a forms interface. I have this done already. However, we have to think about what the other options are. For example, SVG is an emerging standard. XForms is another standard.

And then we require integration with Databases. It is no good if my table is part of topic. Some people feel that databases are end-all of all structured data. Though I don't quite agree, we should anyway respect the fact that they are extremely well supported, and provide good integration with other tools in enterprise.

It is easy if my data has to reside only within a database. A plug-in can then act as an intermediary. The topic only contains the table name, the database name and maybe the set of columns. (Or maybe the SQL query.)
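A minimal sketch of such an intermediary, using SQLite from the Python standard library as a stand-in for whatever database is actually used. The function name fetch_rows and the books table are assumptions for illustration.

```python
import re
import sqlite3

# Table and column names come from the topic text, so check them strictly:
# identifiers cannot be passed as query parameters.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def fetch_rows(conn, table, columns):
    """Fetch the named columns of a table as a list of dictionaries."""
    for name in [table, *columns]:
        if not IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    return [dict(zip(columns, row)) for row in conn.execute(sql).fetchall()]

# Demo with an in-memory database standing in for the enterprise one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, owner TEXT)")
conn.execute("INSERT INTO books VALUES (?, ?)", ("Dune", "asha"))
print(fetch_rows(conn, "books", ["title", "owner"]))
```

In this arrangement the topic holds only the names, and every page view goes back to the database for the data.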

But this is not acceptable. Why?

The twiki topic is a "View". Even though it is dynamic view, capabilities such as Search would be far more effective if the view is materialized, and available to Search and other twiki functionalities.

Integration with Email: The TWiki model currently is centralized; meaning the information being processed remains central. But because it is a collaboration tool, it has to, eventually, support synchronization between multiple installations. Specifically, we would like to have twiki topics delivered by email, and if required, edited and sent back as email. This can only work if the topic views are the same as their actual contents.

So, it becomes clear that twiki topics should have an internal cache, and mechanisms to update it. There is already a Cache plugin. But this is not sufficient; we require good control over caching behaviour. I plan to do a plugin that allows you to do "Render and Cache" i.e. it has two arguments: the first is the content to render (typically INCLUDE or SEARCH), and the second is the actual cache. It either shows only the contents of the cache, or it updates the cache with the rendered twiki markup, and then shows the page.
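The intended behaviour can be sketched as below. This is not the planned plugin, only an illustration of the two-argument idea: the cache is modelled as a plain dictionary (in a topic it would be stored text), and the render callable and the refresh flag are assumptions.

```python
def render_and_cache(source, cache, render, refresh=False):
    """Show the cached rendering, or re-render, store it, then show it.

    source  -- the content to render (e.g. an INCLUDE or SEARCH expression)
    cache   -- dict standing in for the cached text kept with the topic
    render  -- callable that expands `source` into rendered markup
    refresh -- when True, update the cache before showing it
    """
    if refresh or cache.get("text") is None:
        cache["text"] = render(source)
    return cache["text"]

def fake_render(source):
    # Stand-in for the expensive dynamic rendering step.
    return f"[rendered: {source}]"

cache = {}
print(render_and_cache('%SEARCH{"books"}%', cache, fake_render))  # renders and stores
print(render_and_cache('%SEARCH{"books"}%', cache, fake_render))  # served from the cache
```

Because the cached text is the materialized view, it is what Search and email delivery would operate on, while the refresh step gives explicit control over when it changes.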

More on this design later.

permalink 2:23 PM 0 Comments