On Tuesday, June 23rd, 6:30 pm.

The Toronto APL Special Interest Group presents

A Panel Discussion on APL, J, OLAP and Data Mining

Panelists and Speakers:

1. Dan King
   "Dyadic Transpose: My Favourite Primitive for Multi-dimensional Data Management"
   (short presentation)

2. Boyd Carter, DMR Consulting Group, Associate Director of Data Warehousing
   "Overview of Data Warehousing"

3. Edwin M. Knorr
   "Outliers and Data Mining: Finding Exceptions in Large Datasets"

   Follow-up: postscript version of Edwin's publication on this topic can be found at:
   http://www.cs.ubc.ca:80/nest/dbsl/public/vldb98.ps

Also (a second short presentation):

1. Richard Levine
   "Delete Redundant Blanks: A Walk through APL Coding Examples, from the Past to the Present"

==============================================================

Abstracts:

1. "Overview of Data Warehousing"
   Boyd Carter, DMR Consulting Group, Associate Director of Data Warehousing

Machiavelli:
"There is nothing more difficult to take in hand, more perilous to conduct, or more uncertain in its success, than to take the lead in the introduction of a new order of things"

Machiavelli could have been instructing the young Prince in the building of a Data Warehouse. Boyd Carter will identify one of the key streams of activity that can turn this negative view into a positive one.

If you don't know where you are going, practically every path you take to get there will be wrong, especially when the path has as many branches as the Data Warehousing path. He will describe how you can make a business case for Data Warehousing (know where you are going), track the probability of success through the construction of the Data Warehouse (take the right paths) and ensure that upon completion, the benefits described in the business case are realized (arrive at the planned destination).
Boyd believes that:
"There is nothing more challenging to take in hand, more exhilarating to conduct, or more rewarding in its success, than to take the lead in the introduction of a new order of things" - A Data Warehouse.

2. "Outliers and Data Mining: Finding Exceptions in Large Datasets"
   Edwin M. Knorr

What do Wayne Gretzky, Chris Osgood, Alexander Mogilny, Ray Bourque, and Vladimir Konstantinov have in common?

Answer: They're all outliers.

In some data mining applications, the patterns in the data are well established, but it is the exceptions to those patterns that are most interesting. The identification of outliers can lead to the discovery of truly unexpected knowledge in areas such as electronic commerce, credit card fraud, and even the analysis of performance statistics of professional athletes.

In this talk, we study the notion of DB (Distance-Based) outliers. Using examples, we show how to identify all distance-based outliers in large, multidimensional datasets. Our cell-based algorithm guarantees at most 3 passes over a dataset, and is appealing for disk-resident datasets of dimension k < 5. (Our nested loop algorithm is best for k >= 5 dimensions.)

This talk will describe outliers, data mining, existing statistical approaches for detecting outliers, an overview of our cell-based algorithm, performance results, and a case study involving NHL player statistics.

Ed Knorr is a PhD candidate in Computer Science at the University of British Columbia. His thesis is "Outliers and Data Mining: Finding Exceptions in Large Datasets". Ed also has an MSc degree from UBC and a BMath (co-op) degree from the University of Waterloo. He has over 10 years of experience as a systems programmer and database analyst in large, corporate environments.
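
For readers unfamiliar with the term, the idea of a distance-based outlier mentioned in the second abstract can be illustrated with a small sketch. This is only a naive pairwise check written for this notice, not the cell-based or nested loop algorithm from the talk. It assumes the commonly cited definition (a point is an outlier if at least a fraction p of the other points lie farther than distance d from it); the function name `db_outliers`, the parameters `p` and `d`, and the sample points are all hypothetical.

```python
import math

def db_outliers(points, p, d):
    """Return indices of points for which at least a fraction p of the
    OTHER points lie farther than distance d away (assumed definition)."""
    n = len(points)
    outliers = []
    for i, a in enumerate(points):
        # Count how many other points are farther than d from point i.
        far = sum(
            1
            for j, b in enumerate(points)
            if j != i and math.dist(a, b) > d
        )
        if far >= p * (n - 1):
            outliers.append(i)
    return outliers

# Made-up data: four points close together and one far away.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
print(db_outliers(points, p=0.9, d=1.0))  # prints [4]
```

The check compares every pair of points, so it needs work proportional to the square of the dataset size; it is meant only to show what is being searched for, not how to search large, disk-resident datasets.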
==============================================================

Date:     TUESDAY, June 23, 1998
Time:     6:30 pm
Location: ROOM V-154
          (Television Studio "C" - ground floor, North side of the building, past elevator)
          Rogers Communication Building
          80 Gould Street (North-East Corner of Church & Gould)
          Toronto, Ontario

==============================================================

The Toronto APL Special Interest Group
http://www.torontoapl.org