Data Miners

  • DANIEL PEEBLES FOR TIME


    Most tantalizing for counterterrorism investigators are the possibilities of predictive analysis. Technology from digiMine helps the J. Crew website identify the piece of clothing a shopper is most likely to buy based on his or her previous purchases. A loan company using predictive-analysis software from Sightward, based in Bellevue, Wash., discovered that the No. 1 indicator of whether Web applicants will go through with a loan rather than merely check current quotes was whether they voluntarily identified their gender on the website.

    ABM, based in Nottingham, England, has sold its predictive software to both businesses and law-enforcement agencies, primarily in its home market. Police in Hampshire, England, used the system to analyze patterns behind a rash of burglaries within an apartment complex and determined which building was likely to be the next target. Police stepped up patrols there and arrested a man who was carrying a computer out at 4 a.m. He confessed to the other crimes.

    Even more useful is new software that links the databases of different agencies or divisions of large corporations. Markup languages such as XML (short for Extensible Markup Language) provide a kind of universal translator that can reach into the oldest computer systems and make their data comprehensible to customized search engines and data-mining applications. Autonomy has enabled employees of such megafirms as BP, General Electric and General Motors to retrieve information from their companies' mishmash of databases around the globe. By installing such software, GM was able to offer its employees easy access to each of its 700 intranets and thereby reduce the tech staff used to maintain these systems from 30 full-time positions to one half-timer.
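
    To make the "universal translator" idea concrete, here is a minimal sketch in Python of how records exported as XML by two unrelated systems might be mapped into one common shape and then searched together. It is an illustration only, not a description of Autonomy's software; the tag names, the people and the helper functions are all invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML exports from two unrelated systems.
# Every tag, attribute and name below is invented for illustration.
LEGACY_HR_XML = """
<employees>
  <emp><nm>Ada Park</nm><dept>Design</dept></emp>
  <emp><nm>Luis Ortega</nm><dept>Powertrain</dept></emp>
</employees>
"""

PLANT_XML = """
<staff>
  <person name="Ada Park" site="Plant 4"/>
  <person name="Mei Tan" site="Plant 9"/>
</staff>
"""


def parse_legacy_hr(xml_text):
    """Map the first system's element-based layout to a common record."""
    root = ET.fromstring(xml_text.strip())
    return [
        {"name": emp.findtext("nm"), "source": "hr", "detail": emp.findtext("dept")}
        for emp in root.iter("emp")
    ]


def parse_plant(xml_text):
    """Map the second system's attribute-based layout to the same record."""
    root = ET.fromstring(xml_text.strip())
    return [
        {"name": p.get("name"), "source": "plant", "detail": p.get("site")}
        for p in root.iter("person")
    ]


def search(records, name):
    """Case-insensitive exact-name search across the merged records."""
    wanted = name.lower()
    return [r for r in records if r["name"].lower() == wanted]


records = parse_legacy_hr(LEGACY_HR_XML) + parse_plant(PLANT_XML)
hits = search(records, "ada park")

for hit in hits:
    print(hit["source"], hit["name"], hit["detail"])
```

    Once both layouts are reduced to the same fields, a single query reaches both systems. Real integrations would also have to cope with misspellings, duplicates and access controls, none of which this sketch attempts.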

    Autonomy has also linked the databases of 56 British police forces through a project with Unisys known as HOLMES II — after Sherlock, of course. HOLMES II allows officers in different departments to search one another's crime databases and uses artificial-intelligence technology to recognize the meaning of words from their context and make links between similar clues that may have been entered differently by different people.

    Americans got a glimpse of how such a system might work this fall during the Washington-sniper investigation. Two weeks into the shootings, Knowledge Computing, an Arizona company whose Coplink system has integrated police databases in Tucson, Ariz., and Phoenix, volunteered its software to help with the investigation. The system was set up in Montgomery County, Md., only a day before the arrests were made, so it did not play a role in solving the shootings. Working through the hundreds of thousands of leads that were entered into various police computer systems, however, Coplink noted that witnesses reported seeing John Muhammad's blue Chevrolet Caprice near two of the Washington-area shootings, and local police ran computer checks on his license plate at least three times during the killing spree.

    Such tip-sharing systems are expected to spread quickly among police agencies. But implementing them at the federal level will be a nightmare. The size, complexity and backwardness of some government systems mock any IT company's claim of scalability. Many FBI agents still use computers without point-and-click capabilities. And the IRS still stores many of its records on reels of computer tape.

    A key challenge will be better sharing of information without compromising its security and without violating laws that protect the privacy and civil rights of individuals. A CIA analyst can legally view only a small percentage of information within FBI computers. The complex system of security clearances locks down entire classes of data, often just for bureaucratic convenience. There are first steps toward solving the problem. Convera, based in Vienna, Va., is adding a new feature to its retrieval software that will automatically identify the classification level of a given document and then distribute it to whoever is most likely to need it. Verity, based in Sunnyvale, Calif., already supplies businesses with software that hides classified documents from unauthorized employees.
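
    The basic idea of hiding documents from readers who lack the right clearance can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Convera's or Verity's method: the three levels, the `Document` type and the `visible_to` helper are hypothetical, and real clearance schemes involve compartments and need-to-know rules that a single ordered scale cannot express.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """A hypothetical, strictly ordered set of classification levels."""
    UNCLASSIFIED = 0
    CONFIDENTIAL = 1
    SECRET = 2


@dataclass(frozen=True)
class Document:
    title: str
    level: Level


def visible_to(documents, clearance):
    """Return only the documents at or below the reader's clearance."""
    return [doc for doc in documents if doc.level <= clearance]


# Invented sample documents for the example.
DOCS = [
    Document("Cafeteria schedule", Level.UNCLASSIFIED),
    Document("Vendor contract summary", Level.CONFIDENTIAL),
    Document("Field report", Level.SECRET),
]

for doc in visible_to(DOCS, Level.CONFIDENTIAL):
    print(doc.title)
```

    A reader cleared to the middle level sees the first two documents and never learns that the third exists, which is the behavior the search software has to guarantee before agencies can safely share an index.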

    Ultimately, though, the biggest hurdle will be the size of the government and the sheer number of its discrete systems and databases. "This is a 10-year project," says digiMine's Fayyad. "Anyone who thinks that you can simply apply a piece of software on top of the data and then you're in business doesn't understand it." Fayyad has decided to take a pass on the government market because of its complexities and what he describes as the Bush Administration's failure to present a clear picture of the system it wants to create. But eager to take his place are scores of software companies looking for a market, any market, that shows signs of life.
