MS (Data Science)
Objectives
Data science has become an important field because of the need to analyze and understand the ever-increasing data generated by a multitude of sources. By gaining admission to the MS programme, applicants will not only improve their qualifications but will also be equipped with the latest knowledge in the field. Students will be exposed to different aspects of data science, including programming, data structures and algorithms for data science, data analysis and visualization, and big data analytics. This exposure will give students the opportunity to work effectively in local and international markets as well as to pursue a PhD degree in data science.
MS (Data Science) Part Time
Quaid-i-Azam University is keen to serve the community and provides an opportunity for those who are working in related fields and may not be able to attend a full-time postgraduate programme. Classes will be held in the evening after 5 pm. By gaining admission to the programme, applicants will not only improve their qualifications but will also be equipped with the latest knowledge and techniques in the field of information studies.
Programme Objectives
The main objectives of the programme are to enable students to:
- Analyze and solve problems related to data for local and international markets
- Effectively use appropriate techniques and tools used in data science
- Learn state of the art technologies associated with data science
- Pursue a PhD degree in data science or associated disciplines
Admissions Requirements
The minimum requirements for admission are:
- BS or MSc (16 years of education) in a computing-related discipline, OR
- BS or MSc (16 years of education) in a non-computing discipline, having studied at least two mathematics-related courses and one statistics-related course in the terminal degree. Such students must pass the non-credit deficiency courses “Problem Solving and Programming”, “Data Structures”, and “Database Systems”. These courses may be exempted if the applicant has already studied them in his or her terminal degree.
- Applicants must pass the test and interview arranged by the department
Programme Structure
The MS programme consists of 30 credit hours. Each course carries 3 credit hours; there are 2 compulsory courses and a number of optional courses.
The MS (Data Science) programme is offered with 2 options:
- 24 credit hours for the course work and 6 credit hours for project/thesis
- 30 credit hours for the course work
Compulsory Courses
DSC-605: Programming for Data Science
Problem solving for data science, programming constructs, data structures (lists, sets, tuples, dictionaries), working with different types of data (tabular, semi-structured, unstructured), data wrangling (understanding data, transforming and structuring data, data cleaning, data enrichment, data validation), introduction to data analytics, data visualization basics.
DSC-606: Probability and Statistics for Data Science
Probability spaces, random variables, multivariate random variables, expectation, convergence, statistical models, estimation, hypothesis testing, Bayesian methods, linear regression, logistic regression, applications of probability and statistics in data science, applied data science case-studies.
Elective Courses
DSC-632: Advanced Web Development
Essentials of web development; JavaScript: basic concepts, synchronous and asynchronous programming, event loop, callback hell and promises; Server-Side Scripting: introduction, backend architecture, REPL (Read-Eval-Print Loop), server-side modules, synchronous and asynchronous CRUD (Create, Read, Update, Delete) operations; Data Handling; API Development: RESTful APIs, GraphQL, API testing; Middleware; Template engines; Databases: connecting backend with databases; Front-end Frameworks; Styling in frontend frameworks; Frontend state management; Frontend API access
DSC-653: Natural Language Processing
Introduction to natural language processing, applications of NLP, regular expressions, regular expressions in NLP, text normalization, language modeling with n-grams, Naïve Bayes classification and sentiment, POS tagging and HMM, Viterbi algorithm, CFGs and PCFGs, semantic analysis, text classification and evaluation, topic modeling, opinion mining, information extraction, named entity recognition, information retrieval, word sense disambiguation, question answering, dialog systems, coreference resolution, text summarization, text alignment, language translation, latent semantic indexing, dimensionality reduction.
DSC-660: Research Methods
Overview of computer science sub-areas, Introduction to research methods. The objectives and dimensions of research. Tools of research. Key research areas of interest. The research problems: Finding a problem, stating the problem, identifying sub-problems. Review of related literature: Survey paper presentation. Planning the research project. The scientific method, Research planning, Data analysis and Statistical Methods. Conducting research in computer science. Proposal presentation. Research methodology: Quantitative and qualitative approach
DSC-665: Cloud Computing
Introduction to Cloud Computing; Virtualization; Cloud Stack; Infrastructure as a Service (IaaS); Platform as a Service (PaaS); Software as a Service (SaaS); Cloud Computing Platforms; Distributed Storage Systems; Parallel Programming in the Cloud; Mobile Cloud Computing; Cloudlets; Commercial Cloud Computing Platforms;
DSC-672: e-Government
Introduction to the course; Overview of trends driving the development of government/non-profit web sites and their analysis; Citizen-Centric Web Design; Overview of key e-government practices and applications: Citizen to Government, Business to Government, Government to Government; Policy issues in e-government: Public Access & Government Transparency, Privacy and Security Issues; IT Management for Governments and Non-profits.
DSC-673: Multimedia Technology
Multimedia: Image acquisition and manipulation; Still Imaging; 3D & Animation: Moving image creation and manipulation; Video: acquisition and manipulation; Audio: acquisition and manipulation; Multimedia Platforms: Unix/SG, PC, Mac, Linux; Convergence, Internet & Interactive Media; Technical Issues and future trends
DSC-675: Information Retrieval Systems
Structure of IR Systems; Vector Space Models; Language Models; Evidence from Behavior; Evidence from Metadata; User Interaction; Indexing; Cross-Language Retrieval; Document Image Retrieval; Speech and Music retrieval; Photograph and Video Retrieval
DSC-731: Optimization Methods for Data Science
Linear optimization; unconstrained nonlinear optimization; nonlinear optimization algorithms; linear regression; nonlinear regression; logistic regression; numerical solution for linear systems; stochastic gradient descent; constrained nonlinear optimization; quadratic programs; application of optimization methods in data science
DSC-741: Computer Vision
Computer vision, geometric primitives and 2D transformations, image processing and filtering, camera model, convolution, feature detection, matching, model fitting, object recognition, supervised learning, linear classifiers, neural networks and convolutional neural networks, their applications in object detection and localization, stereo, optical flow and tracking.
DSC-742: Internet of Things
Introduction to the Internet of Things; IoT and its importance; Elements of an IoT ecosystem; Technology and business drivers; IoT applications, trends and implications; Basics of Networking; Understanding the abstraction of layers; TCP/IP protocol stack; SDN Architecture; Control and Management plane improvements with SDN; Network Automation and Virtualization; Arduino and Raspberry Pi; Sensors and sensor nodes; Sensing components and devices; Sensor modules, nodes, motes and systems; Connectivity and networks; Wireless technologies for the IoT; Edge connectivity and protocols; IoT exercises; Local processing on the sensor nodes; Connecting devices at the edge and to the cloud; Processing data offline and in the cloud.
DSC-772: Data Mining
The data mining process, Data preprocessing: Cleaning, Transformation, Reduction. Data warehousing: Multidimensional data modeling, OLAP. Classification: Bayesian, Decision tree. Prediction, Mining association rules: Association rules and types, Interestingness measures, multidimensional rules, Apriori algorithm. Cluster analysis: Partitional, Hierarchical, Density based. Mining complex data.
DSC-775: Digital Libraries
Introduction to digital libraries; A brief history of Digital Library development; Planning a Digital Library; Presenting documents; Digital Document Formats: Markup and Metadata; Metadata standards; Interoperability; Digital Preservation, new media; Case Studies of Digital Libraries; Research Issues in Digital Libraries
DSC-781: Content based Information Retrieval
Introduction to multimedia content (i.e., text, audio, image, and video objects); multiple modalities of information (i.e., textual, acoustic, and visual); web and multimedia contents; indexing and retrieval of various types of multimedia-based content; multimedia content descriptors and feature sets; multimedia content modeling; structured and semi-structured data models; content-based indexing and retrieval models; CBIR architectures; multimedia information retrieval and ranking; content-based web search; CBIR evaluation; emerging trends in CBIR.
DSC-787: Special Topics in Data Science
Latest trends in data science; state of the art research in data science; data science case-studies; Topics which are not covered in other courses;
DSC-815: Neural Information Retrieval
Information retrieval fundamentals, information retrieval evaluation, word representation learning, word embeddings, language modeling, Word2Vec, FastText, word embeddings in information retrieval, query expansion with word embeddings, application to patent retrieval, neural networks, neural network methods in NLP; Sequence Modeling with CNNs (Convolutional Neural Networks) and RNNs (Recurrent Neural Networks): modeling word n-grams with CNNs, hierarchical CNNs, recurrent neural networks, simple RNN, RNN as encoder, LSTM, LSTM gating mechanism, encoder-decoder architecture, attention mechanism; Transformer and BERT: transformer architecture, contextualization via self-attention, transformer positional encoding, masked language modeling, BERT, extractive question answering; Neural Re-ranking: text-based neural models, properties of neural IR models, neural re-ranking models, re-ranking evaluation, KNRM (Kernel-based Neural Ranking Model), convolutional KNRM; Transformer Contextualized Re-ranking: web search with BERT, re-ranking with BERT, splitting BERT (PreTTR and ColBERT), TK (Transformer-Kernel) ranking, TKL (Transformer-Kernel for Long documents), and the hybrid approach IDCM (Intra-Document Cascade Model), domain-specific information retrieval applications; Dense Retrieval Models and Knowledge Distillation: neural methods for IR beyond re-ranking, dense retrieval, dense retrieval and re-ranking, dense retrieval lifecycle, BERTDOT model, nearest neighbour search, GPU brute-force nearest neighbour search, approximate nearest neighbour search, knowledge distillation, DistilBERT, distillation in IR; deep learning based recommender systems.
DSC-816: Medical Image Analysis
Introduction to medical image analysis, medical imaging modalities, image acquisition techniques: MRI, CT, ultrasound, X-ray, challenges and characteristics of medical image data, role of medical image analysis in healthcare and medical research, medical image preprocessing, image denoising and artifact correction techniques, image enhancement methods, image segmentation in medical images, feature extraction and representation, image registration and fusion, classification and diagnosis, machine learning and deep learning approaches for medical image classification and diagnosis, computer-aided diagnosis (CAD) systems in medical imaging and performance measures.
DSC-817: Transfer Learning and Applications
Overview of machine learning and deep learning, neural networks and optimization algorithms, introduction to transfer learning and pretrained models, evaluation and model selection, feature extraction using pretrained models, techniques for fine-tuning pretrained models, unsupervised domain adaptation and self-supervised learning, multi-task learning and applications, Big Transfer (BiT), Vision Transformers (ViT), transfer learning in computer vision, challenges and considerations in transfer learning for computer vision tasks, transfer learning in Natural Language Processing (NLP), fine-tuning pretrained language models for various NLP tasks, applications of transfer learning in sentiment analysis, text generation, and question answering, meta-learning and few-shot learning, lifelong learning.
DSC-818: Advanced Computer Vision
Advanced Image Processing Techniques, Image enhancement: multi-scale processing, contrast enhancement, Edge detection, Image restoration, Feature Extraction and Description, Matching and alignment of image features, Image Segmentation, Semantic Segmentation, Object detection with CNNs, Transfer learning and fine-tuning pre-trained models, 3D object recognition and pose estimation, Video Analysis and Tracking, Generative Adversarial Networks (GANs) for image generation and style transfer, Attention mechanisms in computer vision, Visual understanding in specific domains: robotics, medical imaging, autonomous vehicles
DSC-819: Machine Learning for Software Engineering
Representation of problems and ML/DL models for Software Engineering tasks related to the following: Requirements Engineering (e.g. requirement tracing, requirement prioritization, requirement assessment), Design and Modeling (e.g. design pattern detection, software modeling, architecture evaluation), Implementation (e.g. code smell detection, code summarization, code comment management), Defect Analysis (e.g. defect localization, defect categorization, defect cause analysis), Project Management (e.g. software effort estimation, software crowdsourcing recommendations), evaluation approaches and measures, embedding techniques and pre-trained models for software, factors in selection of ML/DL, challenges in application of ML for SE, future research directions
DSC-823: Data Warehousing
Overview of the course and a brief history; Data Warehouse Architecture; Extract Transform Load; Data Cleansing Algorithms; Hot and Cold Data; Data Warehouse support for OLAP and Data Mining; Active Data warehousing; Semantic Data warehousing; Oracle solution; TeraData solution; Case Studies.
DSC-825: Machine Learning
Introduction: Overview of machine learning, machine learning applications and examples, inductive learning, inductive bias; Decision tree learning: Representation, algorithm, hypothesis space, rule extraction, pruning; Neural networks: Representation, perceptrons, multilayer networks, backpropagation algorithm, training procedures; Bayesian learning: Bayes theorem, maximum likelihood hypothesis, Bayes classifiers, Bayesian belief networks; Instance-based learning: K-nearest neighbours, lazy and eager learning; Evaluating hypotheses: Hypothesis accuracy, sampling theory, hypothesis testing; Computational learning theory: Probably learning an approximately correct hypothesis, sample complexity for finite and infinite hypothesis spaces, VC dimension; Unsupervised learning: Clustering, types, steps
DSC-826: Pattern Recognition
Introduction: Overview of pattern recognition, review of matrix algebra, probability distributions and probability; Bayesian Decision Theory: Classifiers, discriminant functions and decision surfaces, normal density and discriminant functions for normal density, error bounds for normal densities; Bayesian Parameter Estimation: Univariate and multivariate cases for the Gaussian distribution, bias, class-conditional densities; Maximum Likelihood Estimation: The general principle, unknown parameter cases for the Gaussian distribution; Problems of dimensionality: Accuracy and training sample size, computational complexity, overfitting; Component Analysis: Principal Component Analysis, Fisher linear discriminant; Non-Parametric Techniques: Parzen windows, K-nearest neighbour rule and estimation.
DSC-827: Social Network Analysis
Introduction to social networks, random network models, identifying connected components, giant component, average shortest path, diameter, preferential attachment, network centrality, betweenness, closeness, clustering, community structure, modularity, overlapping communities, small world network models, contagion, opinion formation, applications of social network analysis, social media networks.
DSC-828: Computational Modeling
Introduction to Computational Modeling, History, Motivation and Prospects; Modeling process, input modeling, Random Numbers and Distributions, Monte Carlo Methods; Discrete Event Modeling, Introduction to Markov Processes, Queuing Theory; Continuous Time Modeling, System Dynamics Modeling; Game Theoretic Modeling, Prisoner’s Dilemma, Nash Equilibrium; Verification and Validation of Models; Verification and Validation Techniques; Data Driven Modeling; Concept Learning and Decision, Classification and Clustering; Agent Based Modeling, Belief, Desire and Intention (BDI), Multi-Agent Systems; Social Simulation, Social Dynamics, Complex Systems, Emergence.
DSC-829: Social Media Mining
Social media platforms; data retrieval from social media platforms; analyzing social media data; applications of social media mining: trend detection, fake posts and profiles detection, fraud detection, discussions analysis (like governance, businesses, politics); privacy for social media mining; social issues in social media mining;
DSC-8XX: Advanced Social Network Analysis
Overview of social network analysis; dynamic networks – representation, measures, centrality, connectedness, growth, applications of dynamic networks; multiplex networks – formulation of multiplex networks, weighted multiplex networks, unweighted multiplex networks, neighbors, paths, layers, centrality measures in multiplex networks, communities in multiplex networks; network resilience – measuring resilience, failure in networks, targeted attacks in networks; information diffusion – modeling information diffusion, epidemic models, influence models, explanatory models; recent trends in social network analysis.
DSC-838: Recommender Systems
Introduction to recommender systems; Collaborative recommendation; Content-based recommendation; Knowledge-based recommendation; Hybrid recommendation approaches; Evaluating recommender systems; Recommender systems and the next-generation web; Group Recommender Systems; Trust-Aware Recommender Systems; Social Recommender Systems; Time and Location Sensitive Recommender Systems, Cross-Domain Recommender Systems, Multi-Criteria Recommender Systems; Advanced topics in recommender systems
DSC-839: Information Cryptography
Introduction to Nondeterministic and deterministic cryptography, Computationally secure encryption, Asymptotic ciphers, Computational Indistinguishability, Symmetric multi-party computation, Searchable encryption, Secure multiparty computation, Oblivious data transfer, Shannon perfect security, Linear and Nonlinear feedback shift registers, Block cipher structures, Design and analysis of block cipher primitives, Design and analysis of stream cipher primitives, Design and analysis of collision-resistant hashing algorithms, Differential and Linear Cryptanalysis, Attacks on block ciphers, Attacks on stream ciphers, Partitioning cryptanalysis, Brute-force attacks, Plaintext attacks, Man-in-the-middle attack, Chosen plaintext attack, Known key attack, Ciphertext-only attack, Non-deterministic and deterministic random bit generators, Statistical randomness evaluation, Key generation, Key distribution centre, Factorization of polynomials, Gröbner basis algorithms, Shor's algorithm, Digital signature standard, Merkle–Hellman knapsack cryptosystem, Public key exchange, Zero-Knowledge Cave, Certification authorities.
DSC-841: Metadata for Information Resources
Overview of the course and Metadata; History of schemes and metadata communities; Functions and Types of metadata; Metadata Structure and Characteristics: Semantics, syntax, and structure; Metadata creation process models; Interoperability; Metadata Integration and Architecture: Warwick Framework; Resource Description Framework; Open Archives Initiative; Encoding Standards (Markup Languages): Introduction and history of markup; Metadata use of markup languages; Document Type Definitions (DTD); Structural metadata; Data Control Standards: Resource Identifiers; Data Registries; Controlled vocabularies; Name authority control (ISAAR and FRANAR); A-Core; Encoded Archival Description (EAD), Text Encoding Initiative (TEI); Metadata Evaluation: User needs; Quality control issues; Evaluation methods; Educational Metadata: Instructional Management Systems (IMS); Learning Object Metadata (LOM); Gateway to Educational Materials (GEM); Government Information Locator Service (GILS); Visual Resources Metadata: Categories for the Description of Works of Art (CDWA); Visual Resources Association (VRA) Core; Computer Interchange of Museum Information (CIMI)
DSC-845: Deep Learning
Review of Neural Networks, activation functions & back-propagation; multilayer neural nets, deep networks, learning algorithms; Convolutional Neural Networks: History, Convolution, Pooling, CNNs for classification, CNN Architectures; Sequence Modeling: Recurrent and Recursive Nets, Long Short-Term Memory models and variants, Language modeling and image captioning; Unsupervised learning: Restricted Boltzmann Machines and Auto-encoders; Case Studies.
DSC-846: Probabilistic Graphical Models
Probabilistic Graphical Models; Bayesian Networks; Directed Graphical Models; Undirected Graphical Models; Factor Graphs; Local Probabilistic Models; Probabilistic Inference; Exact Inference; Approximate Inference; Maximum Likelihood and Structural Learning; Learning Temporal Models; Introduction to Causality
DSC-84X: Advanced Natural Language Processing
Introduction to NLP; Observations and Target Encoding; Tensors; Sequence Modeling; Constituency Grammars and Parsers; Dependency Parsing; Information Extraction; Machine Translation; Transformers; Natural Language Generation; Coreference Resolution; Integrating Knowledge in Language Models; Pretrained Language Models; Advanced topics in NLP
DSC-847: High Performance Computing
Introduction to High Performance Computing (HPC); Parallel platforms; Interconnection Networks; Parallel programming patterns; Multiprocessor architectures; Cache coherence in symmetric multiprocessors; OpenMP; Parallel programming with MPI; Parallel algorithms
DSC-848: Big Data Analytics
Big Data Platforms; Big Data Storage, Tools, Algorithms, and Platforms (NoSQL, MapReduce, Hadoop, Apache Spark); Advanced Hashing Techniques; Bloom Filters; Real-Time Stream Analysis (algorithms, sampling, approximate statistics); Linked Big Data Analysis (Ontologies, SPARQL, Linked Open Data); Visualization Tools and Libraries (Tableau, D3.js); Graph Databases (Neo4j)
DSC-849: Data Governance
Data governance strategies; data governance processes; data governance tools; cloud-based data governance; data life-cycle; data quality; data protection; governing streaming data; data privacy; data security; data monitoring
DSC-861: Information Visualization and Presentation
Overview of the course and Information Visualization; Types of Graphs and Visualizations; Data Types and Graph Types; Design Choices in Building Basic Graphs; Multidimensional Graphing; Graphing and Basic Statistics; Perceptual Properties; How to Critique Visual Designs; Interactive Visualization; Multidimensional Interactive Visualization; Animation; Visualization Networks; Visualization for Search Interfaces and related Fields; Visualization for Text Analysis; 3D in Visualization; Research trends in Information Visualization
DSC-871: Digital Preservation
Introduction to Digital Preservation; Digital objects and their preservation; Key issues such as obsolescence of storage media, software, data formats, and hardware, and their solutions; Benefits of digital preservation such as Legal, Accountability & protection from litigation, Protecting the long term view, Protecting investment, Reuse; Reference Models for digital preservation; Role of Metadata and Registries; Preservation Methods, approaches, and their evaluation; Selection and appraisal methodologies; Digital Curation in Digital Libraries; Audit and Certification of Preservation Processes and Repositories.
DSC-881: Visual Analytics
Introduction and overview of the course; Analytical Reasoning and Critical Thinking; Mental and Visualization Models; Data: Representations, Transformations, and Statistics; Visual Representations; Interaction; Communication: Production, Presentation, Dissemination; Sense Making; Collaborative Visual Analytics; Evaluation of VAST systems.
DSC-882: Distributed Networks Analysis
An overview of distributed systems; Distribution system analysis and optimization tools used for network planning, simulation, and prediction of system response; Data communication; Classification issues: mutual exclusion, election, deadlock, termination, data transfer, consistency; Architectural models, design goals, services; Protocols and technologies; Characteristics; Interface; Software; File and directory structures; Sharing; Recovery; Concurrency; Security (access, authentication, encryption); Representative systems; System management and scheduling; Distributed file systems; Transaction management and consistency models; Distributed synchronization; Distributed system security; Performance analysis of distributed data networks; Advanced queuing models and basic queuing networks.
DSC-883: Network Performance Evaluation
Computer and communication network modeling; System performance evaluation; Data Presentation in networks; Network data/information Monitors; Statistical analysis of network data; Selection of Techniques and Metrics for system analysis; Workloads/data/information Management; Workload/data Characterization Techniques; Data transfer Planning and Benchmarking; Simulation Tools for network traffic analysis; Queuing systems (M/M/1, M/M/c/k, Advanced queuing models and Basic queuing networks); Experiments and Group work; Evaluate and design networks and protocols; Investigate network management tools and techniques; Stochastic processes and Markov Model.
DSC-885: Deep Learning on Graphs
Introduction to Graphs; Introduction to Deep Learning; Algorithms for Node Embeddings (random walk based methods, deep learning based methods); Algorithms for Graph Embeddings; Graph Neural Networks (message passing framework, aggregation, deep learning); Graph Convolution Networks; Graph Attention Networks; Graph Autoencoders; Graph Transformers; Scaling GNNs; Applications of GNNs (node classification, link prediction, graph classification, community detection, etc.).
DSC-780: Thesis (6 credit hours)
Deficiency Courses for Students from Non-Computing Backgrounds