Database Research Group

Welcome to the research homepage of the Database Research Group at UOIT.


We have a bunch of people.

  1. Ken Pu, faculty

    Associate professor in Computer Science
    Faculty of Science, UOIT

    Dr. Pu’s expertise is in the area of databases. He is particularly interested in algorithms and methods to handle very large scale data sets, such as Open Data, and data on the Web.

  2. Mohamed Helala, PhD candidate

    Mohamed has been working on scalable computer vision systems that are powered by controllable stream pipelines. His research draws from both computer vision and database systems.

    Mohamed is co-supervised by Dr. Faisal Qureshi, and is currently working at CIBC. He is expected to complete his PhD in the fall of 2018.

  3. Amin Beirami

    Master in Computer Science
    Started in January 2017

    Amin has been working on integrating blockchain technology into relational database systems. His research objective is to create an immutable temporal database based on cryptographic signatures.

  4. Andrei Stoica

    Master in Computer Science
    Started in 2018

    Andrei is interested in applying machine learning to perform deep analysis of Web data. In particular, Andrei is working on semantic understanding and topical analysis of text databases using embedding vectors and deep neural networks.

  5. Michael Valdron

    Master in Computer Science
    Started in 2018

    Michael is working on the integration of classical artificial intelligence with modern database systems. His research aims to create intelligent database engines that can perform active planning, optimization and other high-level reasoning.


We have a bunch of projects going on.

  1. Open Data

    The Open Data initiative refers to governments and institutions releasing operational data to the public in a transparent fashion. For instance, the Canadian government is making nearly 1,000,000 data files available for public review. The topics covered by Canadian Open Data range from agricultural development to government funding.

    The Database Research Group is motivated to invent new database technology to power the next generation of Internet-scale open data initiatives.

  2. Machine Learning & Databases

    ML and databases have always maintained a symbiotic relationship. ML needs data from databases for training and verification, while database engines should integrate ML into their standard query facilities.

    Our group is working on integrating ML as part of the data processing query pipeline.

  3. Constraints and Data

    Solving problem instances that involve large volumes of data is difficult. For example, arranging university teaching schedules is a well-known challenge when thousands of students and hundreds of courses are involved.

    The Database Research Group is working on integrating planning, optimization and other reasoning capabilities into the data analytics pipeline.

  4. Your idea goes here

    Research is about exploration. While we are actively engaged in a number of ongoing projects, the research group invites you to join us with original and exciting ideas that push the boundary of data-driven systems.

    Let’s write elegant code, and build innovative systems.

    (let [languages ["clojure" "go" "python" "kotlin"]
          databases ["postgresql" "rocksdb" "kafka"]
          apps      ["web" "mobile"]]
      ;; doseq rather than the lazy for, so every line is printed eagerly
      (doseq [lang languages
              db   databases
              app  apps]
        (println "We build" app "using" db "in" lang)))


You can access the teaching material at library

  1. Graduate Courses

    • CSCI 6220U: Advanced Database Systems

  2. Undergraduate Courses

    • CSCI 2000U: Scientific Data Analysis
    • CSCI 3055U: Programming Languages
    • CSCI 4020U: Compilers



  1. A Beirami, KQ Pu and Y Zhu, Towards Optimal Snapshot Materialization to Support Large Query Workload for Append-only Temporal Databases, IEEE Service 2018, San Francisco, CA, July, pp. (4), 2018
  2. A Hedrick, Y Zhu and K Pu, Modeling Transition and Mobility Patterns, International Conference on Applied Human Factors and Ergonomics, pp. 528-537, 2017, Springer, Cham
  3. E Reina, KQ Pu and FZ Qureshi, An Index Structure for Fast Range Search in Hamming Space, Computer and Robot Vision (CRV), 2017 14th Conference on, pp. 8-15, 2017, IEEE
  4. A Hedrick, KQ Pu and Y Zhu, Hierarchical temporal mobility analysis with semantic labeling, Computational Science and Computational Intelligence (CSCI), 2016 International Conference on, pp. 1321-1326, 2016, IEEE
  5. M Ferron, KQ Pu and J Szlichta, ARC: A pipeline approach enabling large-scale graph visualization, Advances in Social Networks Analysis and Mining (ASONAM), 2016 IEEE/ACM International Conference on, pp. 1397-1400, 2016, IEEE
  6. MA Helala, KQ Pu and FZ Qureshi, A Formal Algebra Implementation for Distributed Image and Video Stream Processing, Proceedings of the 10th International Conference on Distributed Smart Camera, pp. 84-91, 2016, ACM
  7. R Drake and KQ Pu, Using document space for relational search, Information Reuse and Integration (IRI), 2014 IEEE 15th International Conference on, pp. 841-844, 2014, IEEE
  8. MA Helala, KQ Pu and FZ Qureshi, Towards Efficient Feedback Control in Streaming Computer Vision Pipelines, Asian Conference on Computer Vision, pp. 314-329, 2014, Springer, Cham
  9. ER Reina, F Qureshi and K Pu, An Index Structure for Fast Range Search in Hamming Space, 2014, UOIT
  10. MA Helala, KQ Pu and FZ Qureshi, A stream algebra for computer vision pipelines, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 786-793, 2014
  11. KQ Pu and R Cheung, Tag Grid: Supporting Multidimensional Queries of Tagged Datasets, Recent Trends in Information Reuse and Integration, pp. 331-342, 2012, Springer, Vienna
  12. WE Malloy and KQ Pu, Systems and computer program products to identify related data in a multidimensional database, 2012, US Patent 8,126,871
  13. MA Helala, KQ Pu and FZ Qureshi, Road boundary detection in challenging scenarios, Advanced Video and Signal-Based Surveillance (AVSS), 2012 IEEE Ninth International Conference on, pp. 428-433, 2012, IEEE
  14. A Hedrick and KQ Pu, Authoring relational queries on the mobile devices, Procedia Computer Science, pp. 752-757, 2012, Elsevier
  15. L Rachevsky and KQ Pu, Selection of features for surname classification, Information Reuse and Integration (IRI), 2011 IEEE International Conference on, pp. 15-20, 2011, IEEE
  16. KQ Pu and R Cheung, Tag grid: supporting collaborative and fuzzy multidimensional queries of tagged datasets, Information Reuse and Integration (IRI), 2010 IEEE International Conference on, pp. 364-367, 2010, IEEE
  17. KQ Pu, O Hassanzadeh, R Drake and RJ Miller, Online annotation of text streams with structured entities, Proceedings of the 19th ACM international conference on Information and knowledge management, pp. 29-38, 2010, ACM
  18. F Bourennani, KQ Pu and Y Zhu, Visualization and integration of databases using self-organizing map, Advances in Databases, Knowledge, and Data Applications, 2009. DBKDA'09. First International Conference on, pp. 155-160, 2009, IEEE
  19. F Bourennani, KQ Pu and Y Zhu, Visual integration tool for heterogeneous data type by unified vectorization, Information Reuse & Integration, 2009. IRI'09. IEEE International Conference on, pp. 132-137, 2009, IEEE
  20. F Bourennani, KQ Pu and Y Zhu, Unified Vectorization of Numerical and Textual Data using Self-Organizing Map, International Journal on Advances in Systems and Measurements, 2 (2&3), 2009
  21. Y Zhu, W Howard and KQ Pu, Spatial inference using networks of RFID receiver: a Bayesian approach, Global Telecommunications Conference, 2009. GLOBECOM 2009. IEEE, pp. 1-6, 2009, IEEE
  22. KQ Pu, Keyword query cleaning using hidden markov models, Proceedings of the First International Workshop on Keyword Search on Structured Data, pp. 27-32, 2009, ACM
  23. KQ Pu and X Yu, Frisk: Keyword query cleaning and processing in action, IEEE International Conference on Data Engineering, pp. 1531-1534, 2009, IEEE
  24. KQ Pu, Analysis of Service Compatibility: Complexity and Computation, Services and Business Computing Solutions with XML: Applications for Quality Management and Best Processes, pp. 136-155, 2009, IGI Global
  25. KQ Pu, Analysis of Service Compatibility, Services and Business Computing Solutions with XML: Applications for Quality, pp. 136, 2009
  26. Y Zhu and KQ Pu, Modeling and synthesis of service composition using tree automata, Information Reuse and Integration, 2008. IRI 2008. IEEE International Conference on, pp. 46-51, 2008
  27. WE Malloy and KQ Pu, Methods to identify related data in a multidimensional database, 2008, US Patent 7,472,127
  28. Y Zhu and KQ Pu, Adaptive multicast tree construction for elastic data streams, Global Telecommunications Conference, 2008. IEEE GLOBECOM 2008. IEEE, pp. 1-5, 2008, IEEE
  29. KQ Pu, Service description and analysis from a type theoretic approach, Data Engineering Workshop, 2007 IEEE 23rd International Conference on, pp. 379-386, 2007, IEEE
  30. A Chandel, N Koudas, KQ Pu and D Srivastava, Fast identification of relational constraint violations, Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on, pp. 776-785, 2007, IEEE
  31. KQ Pu and Y Zhu, Fast archiving and querying of heterogeneous sensor data streams, Digital Telecommunications, 2007. ICDT'07. Second International Conference on, pp. 28-28, 2007, IEEE
  32. KQ Pu and Y Zhu, Efficient indexing of heterogeneous data streams with automatic performance configurations, Scientific and Statistical Database Management, 2007. SSBDM'07. 19th International Conference on, pp. 34-34, 2007, IEEE
  33. K Pu, V Hristidis and N Koudas, Syntactic rule based approach to web service composition, Data Engineering, 2006. ICDE'06. Proceedings of the 22nd International Conference on, pp. 31-31, 2006, IEEE
  34. KQ Pu, On formal methods of multidimensional databases, 2006, University of Toronto
  35. KQ Pu and AO Mendelzon, Typed functional query languages with equational specifications, Proceedings of the 14th ACM international conference on Information and knowledge management, pp. 233-234, 2005, ACM
  36. X Yu, KQ Pu and N Koudas, Monitoring k-nearest neighbor queries over moving objects, Data Engineering, 2005. ICDE 2005. Proceedings. 21st International Conference on, pp. 631-642, 2005, IEEE
  37. KQ Pu, Modeling, querying and reasoning about OLAP databases: a functional approach, Proceedings of the 8th ACM international workshop on Data warehousing and OLAP, pp. 1-8, 2005, ACM
  38. KQ Pu, Functional Integration of Relational, OLAP and XML Data, Proceedings of VLDB Workshop on Information Integration on the Web (IIWeb-2004), pp. 97, 2004
  39. AO Mendelzon and KQ Pu, Concise descriptions of subsets of structured sets, Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pp. 123-133, 2003, ACM
  40. KQ Pu, Modeling and control of discrete-event systems with hierarchical abstraction, M.A.Sc. thesis, Dept. Elect. Comput. Eng., Univ. Toronto, Toronto, ON, Canada, 2000
  41. K Pu, Modeling and Control of Discrete-Event systems with Hierarchical abstraction. Ma sc, 2000, Thesis, Dept. of Electl. & Cmptr. Engrg., Univ. of Toronto
  42. KQ Pu, Theory Of Discrete Wavelet Transform And An Error Analysis Of The Pyramid Algorithm, 1998, Citeseer
  43. RJ Miller, F Nargesian, E Zhu, C Christodoulakis, KQ Pu and P Andritsos, Making Open Data Transparent: Data Discovery on Open Data
  44. KQ Pu and Y Zhu, Efficient Indexing of Heterogeneous Data Streams with Automatic Performance Tuning
  45. KQ Pu, Algorithm and Complexity of the Unification Problem of a Polymorphic Attribute-based Type System


  1. F Nargesian, E Zhu, KQ Pu and RJ Miller, Table union search on open data, Proceedings of the VLDB Endowment, 11 (7) , pp. 813-825, 2018, VLDB Endowment
  2. E Zhu, KQ Pu, F Nargesian and RJ Miller, Interactive navigation of open data linkages, Proceedings of the VLDB Endowment, 10 (12) , pp. 1837-1840, 2017, VLDB Endowment
  3. E Zhu, F Nargesian, KQ Pu and RJ Miller, LSH ensemble: Internet-scale domain search, Proceedings of the VLDB Endowment, 9 (12) , pp. 1185-1196, 2016, VLDB Endowment
  4. Z Yu, Y Liu, X Yu and KQ Pu, Scalable distributed processing of K nearest neighbor queries over moving objects, IEEE Transactions on Knowledge and Data Engineering, 27 (5) , pp. 1383-1396, 2015, IEEE
  5. MA Helala, FZ Qureshi and KQ Pu, Automatic parsing of lane and road boundaries in challenging traffic scenes, Journal of electronic imaging, 24 (5) , pp. 053020, 2015, International Society for Optics and Photonics
  6. O Hassanzadeh, KQ Pu, SH Yeganeh, RJ Miller, L Popa, MA Hernández and H Ho, Discovering linkage points over web data, Proceedings of the VLDB Endowment, 6 (6) , pp. 445-456, 2013, VLDB Endowment
  7. KQ Pu, Recent Patents on Information Retrieval Using Natural Language and Keyword Query, Recent Patents on Computer Science, 3 (3) , pp. 186-194, 2010, Bentham Science Publishers
  8. Y Zhu, B Li and KQ Pu, Dynamic multicast in overlay networks with linear capacity constraints, IEEE Transactions on Parallel and Distributed Systems, 20 (7) , pp. 925-939, 2009, IEEE
  9. KQ Pu and X Yu, Keyword query cleaning, Proceedings of the VLDB Endowment, 1 (1) , pp. 909-920, 2008, VLDB Endowment
  10. KQ Pu and AO Mendelzon, Concise descriptions of subsets of structured sets, ACM Transactions on Database Systems (TODS), 30 (1) , pp. 211-248, 2005, ACM