Sunday, April 20, 2008

website update

I added about a half dozen pages to the site today. Check it out if you're interested. For my next update, I'll try to get more detailed and include specific examples.

Saturday, April 12, 2008

Smarter tools: An introduction

Computers are great tools which haven’t been fully exploited yet. So far, software generally provides tools that let you tell the computer what you want, and the computer gives it to you. Word, Excel and the like are general-purpose apps that give you tools. SimplyAccounting and SolidWorks are more vertical-market tools that permit more specialized operations, but they really don’t know what you are up to. QuickTax is a step better. Because tax forms are so standardized and the rules apply to everyone, software companies are able to create software which will figure out your taxes for you, and quickly, because it knows the rules of tax declarations.

Knowledge is the future of software. If you could explain to a computer about the kind of work you do, tell it what your expectations are for the results, then you have a starting point. From that description we can start creating programs to help you get your work done. A computer program can “look over your shoulder” and tell you when you make a mistake. MS Word does this when you type something that isn’t a wordd. Instantly you know either you spelled it wrong or it’s a word MS didn’t put in its wordlist. In Excel, if you have a row of 10 numbers, and then create a sum to add 9 of those numbers, Excel will highlight the sum, and tell you that you missed a number. These tools work because MS programmed some basic knowledge about common usage into their program. But we can go further.

If you had a company with three divisions, Alpha, Bravo and Charlie, you might create a big spreadsheet with multiple pages showing info about various aspects of each division. Then one day you get a new division, Delta. You now need to go through a lot of work extending your spreadsheet in all its pages to include Delta. It could very easily happen that you miss a section, or that you add the section incorrectly. And when your rapidly growing company adds Elephant and Foxtrot, the problem comes up again.

Knowledge-based software can help. We first create a tool which recognizes company division tables in spreadsheets. (This tool is highly reusable for all manner of spreadsheet analysis, so it’s not a waste. It’s also very easy software to write.) Then we tell the tool that all such tables should have 3 divisions (Alpha, Bravo, and Charlie). This is your knowledge base. The tool will instantly tell you that all tables are good. Now you change the knowledge base to mention Delta division. Immediately, the spreadsheet analysis tool knows all your tables are wrong. Using the tool you could quickly jump to each and every one so you can fix them. And the tool will tell you when you are done.
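As a rough illustration of the checker described above, here is a minimal Python sketch. It assumes the division tables have already been recognized and pulled out of the spreadsheet as plain lists of rows with the division name in the first column; the table names, numbers and function name are made up for the example.

```python
# Hypothetical knowledge base: the divisions every table must list.
EXPECTED_DIVISIONS = ["Alpha", "Bravo", "Charlie"]

def find_faulty_tables(tables, expected):
    """Return the tables whose division rows differ from the knowledge base."""
    problems = {}
    for name, rows in tables.items():
        found = [row[0] for row in rows]  # division name is assumed to be in column 0
        missing = [d for d in expected if d not in found]
        unexpected = [d for d in found if d not in expected]
        if missing or unexpected:
            problems[name] = {"missing": missing, "unexpected": unexpected}
    return problems

# Made-up tables standing in for what a recognizer would extract.
tables = {
    "Revenue": [["Alpha", 120], ["Bravo", 95], ["Charlie", 40]],
    "Headcount": [["Alpha", 12], ["Bravo", 9], ["Charlie", 4]],
}

# All tables are good against the original knowledge base.
print(find_faulty_tables(tables, EXPECTED_DIVISIONS))

# Mention Delta in the knowledge base and every table is flagged.
print(find_faulty_tables(tables, EXPECTED_DIVISIONS + ["Delta"]))
```

When the second call comes back empty, you are done fixing tables.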

With a little more work, we can take this example a little further. Suppose we teach the tool about the consistent structure of these tables and about the formulas that need to be replicated. Again, much of this teaching can be quite generic about spreadsheets in general, but some of it will be about your particular spreadsheets. Further, we instruct our tool on how to modify spreadsheets by adding rows. These are trivial functions. Finally, we tell the tool that upon request, it should modify all spreadsheet division tables according to the list of divisions. All of the above pieces are trivial, and the program can be written quickly, so that whenever you have a new division, you type its name, press a button, and bang! Every table in your spreadsheet has an extra line, and it’s correct. Now if it weren’t correct, our previous tool would detect faulty tables and tell you, and you’d know your updater has a bug.
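The updater could look something like the following Python sketch. The row layout (name, revenue, costs, profit formula), the assumption that data rows start on spreadsheet row 2, and the function names are all hypothetical; a real tool would learn them from the table's existing rows.

```python
def make_row(division, sheet_row):
    # Hypothetical layout: name, revenue, costs, and a replicated profit formula.
    return [division, 0, 0, f"=B{sheet_row}-C{sheet_row}"]

def add_missing_divisions(table, divisions, first_sheet_row=2):
    """Append a row for each division the table lacks; return the names added."""
    present = {row[0] for row in table}
    added = []
    for division in divisions:
        if division not in present:
            sheet_row = first_sheet_row + len(table)  # spreadsheet row of the new line
            table.append(make_row(division, sheet_row))
            added.append(division)
    return added

table = [
    ["Alpha", 120, 80, "=B2-C2"],
    ["Bravo", 95, 70, "=B3-C3"],
    ["Charlie", 40, 35, "=B4-C4"],
]

added = add_missing_divisions(table, ["Alpha", "Bravo", "Charlie", "Delta"])
print(added)      # ['Delta']
print(table[-1])  # ['Delta', 0, 0, '=B5-C5']
```

Running the updater a second time adds nothing, which is a cheap sanity check that it is not duplicating rows.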

Now for the magic: each of the simple bits of functionality programmed into our tool was about reading simple data, recognizing simple data, and making simple changes to simple data. In the above example all those items were programmed specifically for the company divisions spreadsheet tool. But all of the details can be abstracted away, and replaced by data specifying what to recognize, what to match, what to replace with what. And if we do that we end up with a tool that can be instructed on maintaining any spreadsheet – you just need to describe the patterns and formats.
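To make the abstraction concrete, here is a small Python sketch in which the checking code knows nothing about divisions; everything specific lives in a list of rules written as data. The rule fields ("table", "key_column", "required") and the sample tables are assumptions made up for this example.

```python
def check_tables(tables, rules):
    """Apply data-described rules to tables; return a list of findings."""
    findings = []
    for rule in rules:
        rows = tables.get(rule["table"], [])
        present = {row[rule["key_column"]] for row in rows}
        for key in rule["required"]:
            if key not in present:
                findings.append(f'{rule["table"]}: missing {key}')
    return findings

# Made-up spreadsheet content: one division table, one quarterly table.
tables = {
    "Revenue": [["Alpha", 120], ["Bravo", 95], ["Charlie", 40]],
    "Budget": [["Q1", 10], ["Q2", 12], ["Q3", 11]],
}

# The "knowledge" is now plain data and can describe any kind of table.
rules = [
    {"table": "Revenue", "key_column": 0, "required": ["Alpha", "Bravo", "Charlie", "Delta"]},
    {"table": "Budget", "key_column": 0, "required": ["Q1", "Q2", "Q3", "Q4"]},
]

print(check_tables(tables, rules))
# ['Revenue: missing Delta', 'Budget: missing Q4']
```

The same function handles the quarterly table without a line of new code, which is the point of moving the details into data.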

And for one further level: A spreadsheet is just a data model. It can be described by bunches of objects with attributes. By abstracting away from the spreadsheet, we end up with a tool to match any data for any system. But so far, it only handles the kinds of patterns that a spreadsheet handles.

Aside: XML is a language which has been used for describing data, and is specialized for various forms of data. A spreadsheet XML format exists, and would work well for our purposes. Alternatively, APIs in the spreadsheet program can expose the cells and update mechanisms.

The program for describing spreadsheet patterns is quite simple. This is the case for most other commonly recognized patterns in many other applications. This is especially the case for precisely defined subjects where right and wrong ways of doing something are clearly specified. Those clear specs are exactly writable as knowledge for a computer to evaluate. Simple programs can be written to evaluate whether the data put in the computer (the CAD design, the source code, the process plan, the purchase order) meets the specifications.
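As one hedged example of what such a program might look like, the Python sketch below treats a purchase order as an object with attributes and each specification as a description paired with a test. The field names and the three specs are invented for illustration.

```python
# Each spec is (description, test). The fields used here are hypothetical.
SPECS = [
    ("has an approver", lambda po: bool(po.get("approver"))),
    ("has at least one line item", lambda po: len(po.get("lines", [])) > 0),
    (
        "total equals the sum of line items",
        lambda po: po.get("total") == sum(qty * price for qty, price in po.get("lines", [])),
    ),
]

def unmet_specs(record, specs):
    """Return the descriptions of every spec the record fails."""
    return [description for description, is_met in specs if not is_met(record)]

# A made-up purchase order: lines are (quantity, unit price) pairs.
order = {"approver": "", "lines": [(2, 50), (1, 25)], "total": 125}

print(unmet_specs(order, SPECS))  # ['has an approver']
```

Adding a new specification means adding one entry to the list, not changing the evaluator.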

Such simple programs enable evaluating all kinds of specifications, so that the engineer or accountant will know instantly what is not done and what is done wrong. There are limits though. Software can only evaluate what it has been clearly told. It can only be told what the expert clearly understands and what fits the kinds of patterns the software has been programmed to recognize.

Nevertheless, a program that only knows some things can still be useful, because it will never miss those things, and it helps you work better because you no longer need to pay attention to them.

Friday, April 11, 2008

Mass Comparative Analysis

There are several sites out there that help you to know what things are popular, starting with Google (PageRank is based on how many sites link to a page), and then Stumble, del.icio.us, and Digg. But to date I haven't seen any site that really lets people compare things side by side.

What I'm imagining is something like PC Magazine's (actually, most computer mags') product comparison articles. In these articles, the authors try a number of related products, analyze the features, and then rate each product on those features. From these they produce a composite score. As a reader who is relatively ignorant of the tool being reviewed, I might decide to investigate the tool scoring 9.7/10 overall. Meanwhile, a somewhat informed reader might decide that "indexed report speed" (I'm making up a random feature here) is the most critical feature and will opt to investigate the two products that ranked highest in that feature.
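Here is a small Python sketch of the two ways of reading such a comparison. The products, features and scores are invented, and the composite is just an unweighted average; real magazines presumably weight features in their own way.

```python
# Made-up feature ratings out of 10 for three hypothetical products.
ratings = {
    "Product A": {"indexed report speed": 9, "ease of use": 6, "price": 6},
    "Product B": {"indexed report speed": 8, "ease of use": 9, "price": 7},
    "Product C": {"indexed report speed": 5, "ease of use": 8, "price": 8},
}

def composite(scores):
    """Unweighted average of a product's feature scores."""
    return sum(scores.values()) / len(scores)

def best_overall(ratings):
    """The product with the highest composite score."""
    return max(ratings, key=lambda name: composite(ratings[name]))

def top_by_feature(ratings, feature, n=2):
    """The n products rated highest on a single feature."""
    ranked = sorted(ratings, key=lambda name: ratings[name][feature], reverse=True)
    return ranked[:n]

print(best_overall(ratings))                            # Product B
print(top_by_feature(ratings, "indexed report speed"))  # ['Product A', 'Product B']
```

The casual reader uses the first function; the informed reader who cares about one feature uses the second.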

Meanwhile, over at BoardGameGeek.com, the operators have created a very impressive system for game lovers (and publishers) to add their favourite games, describe them, and break down the details. Once posted, others can add information, and can rate the games. Then they can construct lists of games (such as Alan's favourites, Very Fast 2 Player Games, Axis and Allies Variants, etc.) People can browse these lists to find games related to ones they already like.

These concepts need to be combined. Moving beyond board games, we could use a site to compare webapps. How does gmail compare to yahoo mail to hotmail? How does blogger compare to the other blog systems? What sites work well with Google Maps? The list goes on and on.

So in this hypothetical system, you enter your favourite sites. Another person adds feature lists to some of these. Someone else makes lists which put several comparable sites side by side. Everyone rates the sites, and rates the features of the sites. People make comments and discuss the sites. Then we can all see feature list comparisons, find related sites quickly, and know in short order which site I should turn to for help getting moving in the direction of web-based backup, or remote access to my home computer, or whatever.

If such a site got going, thousands of people would organize, catalog, describe, rank, and compare thousands of sites on the net. And millions would be able to visit for a very quick rundown of what's the best of what. Just like today they turn to PC Magazine for one comparison of 5 products, tomorrow they will turn to youcomparemillionsofproducts.com (or maybe the creators will come up with a better name) for instant comparisons of everything valuable under the sun.

You could search this database by name, tag, popularity (number of ratings), quality (avg number of stars), etc. And you will find what you want much faster than Google or Stumble could possibly achieve.
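A toy version of that search might look like the Python sketch below. The site names, tags and star ratings are made up, and a real site would sit on a proper database rather than a list in memory.

```python
from statistics import mean

# Made-up entries: each site has tags and a list of star ratings (1 to 5).
sites = [
    {"name": "MailOne", "tags": {"webmail"}, "stars": [5, 4, 5]},
    {"name": "MailTwo", "tags": {"webmail"}, "stars": [4, 4, 4, 3, 4]},
    {"name": "BlogOne", "tags": {"blog"}, "stars": [5, 5]},
]

SORT_KEYS = {
    "popularity": lambda site: len(site["stars"]),  # number of ratings
    "quality": lambda site: mean(site["stars"]) if site["stars"] else 0,  # average stars
}

def search(sites, name=None, tag=None, sort_by="quality"):
    """Filter by name fragment and/or tag, then sort best-first."""
    matches = [
        site for site in sites
        if (name is None or name.lower() in site["name"].lower())
        and (tag is None or tag in site["tags"])
    ]
    return sorted(matches, key=SORT_KEYS[sort_by], reverse=True)

print([s["name"] for s in search(sites, tag="webmail", sort_by="quality")])
# ['MailOne', 'MailTwo']
print([s["name"] for s in search(sites, tag="webmail", sort_by="popularity")])
# ['MailTwo', 'MailOne']
```

Note how popularity and quality can disagree, which is why it helps to offer both.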

So, do you agree? Has this already been done and I missed it? If not, what will it take to get this going? Please share your opinions.

Alan


Tuesday, April 8, 2008

Quantity Aids Creativity

Lifehacks featured a thought that forms an interesting counterpoint to my last post. Try the link


Monday, April 7, 2008

Group Authorship

I'm in the middle of reading this fascinating book, Wikinomics. It raises a lot of ideas about the phenomenon of mass collaboration, which of course gives me lots to discuss. As a demonstration of their concept, the authors created a wiki on their site and invited readers to create an extra chapter to the book.

This invitation had a measure of success, and they indeed got their chapter produced. It makes for an interesting read, and is packed with ideas and thoughts. Yet to my eyes it lacks the cohesion and clarity of thought represented in the book. The original wiki exhibits the same phenomenon. Wikipedia, on the other hand, is relatively coherent. In order to achieve this they took advantage of a common understanding of the encyclopedia format, and made policies to guide the writers. Lots of standardized tags constantly advise readers how to improve pages, if they are so inclined.

But the real key to achieving quality is a dedicated core of editors constantly working to encourage and sometimes enforce the standards. This seems to be the reality of wiki writing. It's good for gathering ideas, organizing ideas, and developing ideas. A bunch of people can work on a site and develop information rapidly. But when it comes to cohesion, clarity and quality, the multitude requires editors to rein in their work and bring it to order.


Yet I don't think the masses, even with editors, could have written the original book. Maybe I'm wrong, but it seems to me that there is something a small group or an individual can achieve that a group can't. I can't quite put my finger on it. Maybe it's "art". Not that the masses couldn't create art, but that it wouldn't be the same thing.


Thursday, April 3, 2008

Beyond Mice and Menus

In 2005 at the University of Washington, Barbara Grosz of Harvard hosted a colloquium on collaborative systems, where she talked about her research on getting computers to interact nicely with people.

As one example, she showed a program which works with a word processor to automatically look up journal references when the operator types hints that a reference is needed. This kind of software is relatively simple to construct (if you know some good sites to find the references), but can over time save researchers hours.

After this she discussed computer programs that work in a shared environment with people, where each participant has its own objective, and considered how programs can learn to cooperate with people by sharing resources, helping both participants achieve their goals more efficiently than if they didn't collaborate.

These examples are just a glimpse into where software will be going in the near future. Check it out!

Generalized vs specialized software

Computers nowadays are considered an indispensable tool for getting work done, but few people think of computers as a collaborator in their job. This is largely because most software isn't designed to fill this role. Programs are written to be generic, used for any purpose which fits their form. Thus we have word processors, spreadsheets, and image editors.

The general perception is that more specialized software is too hard to write, and the market is too narrow to make it worthwhile to build specialized tools. The one area which moves contrary to this general trend is business information systems. These systems are designed to integrate the operations of large enterprises, and keep them operating efficiently. In their effort to automate, companies have had to make these systems more intelligent.

Programming intelligence in regular programming languages quickly became too complex, especially when the rules of operation change quickly. Thus business rules were developed, to provide a more natural way of defining system behavior. These rules can be created quickly and changed quickly, and with good programming skills advanced operations can be automated and linked together within the enterprise.
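To give a feel for the idea, here is a toy Python sketch of rules kept as data, separate from the code that applies them. It is not modelled on any particular business rules product; the order fields, thresholds and actions are invented.

```python
# Each rule pairs a condition with the action it calls for (all hypothetical).
rules = [
    {"when": lambda order: order["total"] > 10_000, "then": "require manager approval"},
    {"when": lambda order: order["country"] != "CA", "then": "add export paperwork"},
]

def actions_for(order, rules):
    """Return the actions of every rule whose condition holds for this order."""
    return [rule["then"] for rule in rules if rule["when"](order)]

order = {"total": 12_500, "country": "US", "rush": True}
print(actions_for(order, rules))
# ['require manager approval', 'add export paperwork']

# Changing behavior means editing the rule list, not the engine.
rules.append({"when": lambda order: order["rush"], "then": "notify shipping"})
print(actions_for(order, rules))
# ['require manager approval', 'add export paperwork', 'notify shipping']
```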

In the field of Computer Aided Design (CAD), knowledge-based engineering systems have been developed for 30 years now, and are hard at work in helping to design airplanes and cars every day. These complex systems integrate CAD platforms with standalone design-generation tools, and with advanced analysis tools. The engineer will use these tools to rapidly generate designs, and then use his own expertise to evaluate the results and try different options until the desired design is achieved.

The business and engineering systems are very different in many respects, but they share a common element: they allow knowledge to be entered at a high level. This knowledge enables the computer to carry out very specialized and intricate tasks quickly under the direction of the operator, or in response to their actions.

Information management and engineering automation are only a start. I believe that almost any task in any subject area can be improved by instructing computers about the task being pursued. Naturally, for most tasks a computer cannot know everything about the job, and so only some aspects of the work will be programmed in. What is needed is software that is well designed to work with you: it helps you with what it knows, and when you move beyond what it knows, it does not let its limited knowledge interfere with your ability to get the job done.
