Information Retrieval Matthias Hagen
Contents I. Introduction
Objectives
Related Fields 1. Statistics
Literature Information Retrieval:
Literature “Core” information retrieval conferences:
Software Industry (Open Source):
Software Services:
Chapter IR:I I. Introduction
Examples of Retrieval Tasks Task: Learn everything there is to learn about information retrieval.
Remarks:
Examples of Retrieval Tasks Task: Plan a trip to Paris, France.
Remarks:
Examples of Retrieval Tasks Task: What were the news of the day?
Remarks:
Examples of Retrieval Tasks Task: Answer “Can Kangaroos jump higher than the Empire State Building?”
Remarks:
Examples of Retrieval Tasks Task: Build a fence.
Remarks:
Examples of Retrieval Tasks Task: Write an essay on video surveillance.
Remarks:
Examples of Retrieval Tasks Task: Given an example image, find more like it.
Examples of Retrieval Tasks Task: Given an example text, find more like it.
Remarks:
Examples of Retrieval Tasks Task: Find out what people commonly write in the phrase how to ? this.
Remarks:
Terminology Information science distinguishes the concepts data, information, and knowledge.
Remarks:
Terminology Definition 4 (Information System)
Remarks:
Terminology
Terminology Definition 7 (Information Retrieval)
Remarks:
Delineation Databases, Data Retrieval
Remarks:
Delineation Semiotics
Remarks:
Delineation Retrieval
Historical Background Manual Retrieval
Remarks:
Historical Background Manual Retrieval
Remarks:
Historical Background Mechanical Retrieval
Historical Background Computerized Retrieval
Remarks:
Historical Background Information Retrieval (1950s)
Historical Background Information Retrieval (1960s)
Remarks:
Historical Background Information Retrieval (1970s)
Historical Background Information Retrieval (1980s - mid-1990s)
Historical Background Information Retrieval (mid-1990s - 2000s)
Web Search
Chapter IR:II II. Architecture of a Search Engine
Remarks:
Indexing Process
Indexing Process Acquisition
Indexing Process Acquisition: Crawler
Remarks:
Indexing Process Acquisition: Converter
Indexing Process Acquisition: Document Store
Indexing Process Text Transformation
Indexing Process Text Transformation: Segmenter
Indexing Process Text Transformation: Stopping
Indexing Process Text Transformation: Stemmer / Lemmatizer
Indexing Process Text Transformation: Link Extraction
Indexing Process Text Transformation: Information Extraction
Indexing Process Text Transformation: Classification
Indexing Process Indexing
Indexing Process Indexing: Document Statistics
Indexing Process Indexing: Weighting
Indexing Process Indexing: Inversion
Indexing Process Indexing: Distribution
Search Process User Interaction
Search Process User Interaction: Query Language
Search Process User Interaction: Query Transformation
Search Process Ranking
Search Process Ranking: Document Scoring
Search Process Ranking: Efficient Document Scoring
Search Process Ranking: Distribution
Search Process User Interaction: Results Output
Search Process User Interaction: Query Transformation
Search Process Logging
Evaluation Overview
Remarks:
Architecture of a Search Engine
Architecture of a Search Engine (Karaoke Version)
Chapter IR:III III. Text Transformation
Text Statistics Questions
Text Statistics Vocabulary Growth: Heaps’ Law
Text Statistics Term Frequency: Zipf’s Law
Remarks:
Text Statistics Term Frequency: Zipf’s Law
Remarks:
Text Statistics Term Frequency: Zipf’s Law
Text Statistics Estimating Result Set Size
Remarks:
Text Statistics Estimating Result Set Size: Joint Probability
Text Statistics Estimating Result Set Size: Conditional Probability
Text Statistics Estimating Result Set Size: Initial Result Set-based Estimation
Text Statistics Estimating Collection Size: Joint Probability-based
Text Statistics Estimating Collection Size: Proportionality
Remarks:
Chapter IR:III III. Text Transformation
Parsing Documents Document Unit
Parsing Documents Index Term
Parsing Documents Document Structure and Markup
Parsing Documents Tokenization
Remarks:
Parsing Documents Tokenization Problems
Parsing Documents Stopping (Token Removal)
Parsing Documents Token Normalization
Parsing Documents Stemming
Parsing Documents Stemming: Principles
Parsing Documents Stemming: Affix Elimination
Parsing Documents Stemming: Porter Stemmer
Parsing Documents Stemming: Krovetz Stemmer
Parsing Documents Stemming: Stemmer Comparison
Parsing Documents Phrases (Multi-Token Index Terms)
Remarks:
Parsing Documents Phrases: N-Grams
Remarks:
Chapter IR:III III. Text Transformation
Information Extraction Overview
Information Extraction Part-of-Speech Tagging
Information Extraction Part-of-Speech Tagging: Example
Information Extraction Part-of-Speech Tagging: Brill Tagger
Remarks:
Information Extraction Noun Phrase
Information Extraction Noun Phrase Extraction
Information Extraction Named Entity
Information Extraction Named Entity Recognition
Information Extraction Named Entity Recognition: Hidden Markov Models (informal)
Chapter IR:IV IV. Indexes
Inverted Indexes Index
Remarks:
Inverted Indexes Data Structure
Remarks:
Inverted Index Postings
Inverted Index Postings Lists
Remarks:
Chapter IR:IV IV. Indexes
Query Processing Query Types
Query Processing Postlist Intersection
Remarks:
Query Processing Postlist Intersection
Query Processing Positional Indexing
Remarks:
Query Processing Document Scoring
Remarks:
Query Processing Document Scoring
Remarks:
Query Processing Document Scoring
Query Processing Top-k Retrieval
Query Processing Index Distribution
Query Processing Caching
Chapter IR:IV IV. Indexes
Index Construction Inversion and Indexing
Index Construction Index Merging
Remarks:
Index Construction Distributed Indexing
Remarks:
Index Construction Distributed Indexing
Index Construction Distributed Indexing: Example Netspeak
Index Construction Index Updates
Chapter IR:IV IV. Indexes
Compression Size Issues
Compression Savings
Compression Basic Idea
Compression Delta Encoding
Compression Bit-Aligned Codes
Compression Unary and Binary Codes
Compression Elias-γ Code
Compression Elias-δ Code
Compression Byte-Aligned Codes
Compression V-Byte Encoding
Compression Example
Chapter IR:V V. Retrieval Models
Overview of Retrieval Models Document Views
Overview of Retrieval Models Retrieval Models
Overview of Retrieval Models Definition 1 (Retrieval Model, Relevance Function)
Remarks:
Overview of Retrieval Models History of Retrieval Models
Chapter IR:V V. Retrieval Models
Empirical Models
Boolean Retrieval Retrieval Model R = ⟨D, Q, ρ⟩
Boolean Retrieval Relevance Function ρ
Boolean Retrieval Example
Remarks:
Boolean Retrieval Query Refinement: “Searching by Numbers”
Boolean Retrieval Discussion
Vector Space Model Retrieval Model R = ⟨D, Q, ρ⟩
Vector Space Model Relevance Function ρ: Cosine Similarity
Vector Space Model Example
Vector Space Model Term Weighting: tf·idf
Remarks:
Vector Space Model Term Weighting: tf·idf
Vector Space Model Query Refinement: Relevance Feedback
Vector Space Model Discussion
Chapter IR:V V. Retrieval Models
Probabilistic Models
Probabilistic Models Probability Ranking Principle
Remarks:
Binary Independence Model Retrieval Model R = ⟨D, Q, ρ⟩
Remarks:
Binary Independence Model Relevance Function ρ: Derivation
Binary Independence Model Relevance Function ρ: Estimation
Remarks:
Binary Independence Model Relevance Function ρ: Example
Binary Independence Model Relevance Function ρ: Summary
Remarks:
Binary Independence Model Query Refinement: Relevance Feedback
Binary Independence Model Query Refinement: Relevance Feedback Example
Binary Independence Model Query Refinement: Relevance Feedback
Remarks:
Binary Independence Model Discussion
Okapi BM25 Retrieval Model R = ⟨D, Q, ρ⟩
Okapi BM25 Background
Remarks:
Okapi BM25 Term Weighting
Okapi BM25 Discussion
Chapter IR:V V. Retrieval Models
Hidden Variable Models Obviously, the terms found in a document d ∈ D are somehow related to the
Hidden Variable Models
Hidden Variable Models Term-Document Matrix
Latent Semantic Indexing Singular Value Decomposition
Remarks:
Latent Semantic Indexing Retrieval Model R = ⟨D, Q, ρ⟩
Latent Semantic Indexing Example
Remarks:
Latent Semantic Indexing Example: Term-Document Matrix A
Latent Semantic Indexing Example: Singular Value Decomposition A = U S V^T
Latent Semantic Indexing Example: Dimensionality Reduction A_k = U_k S_k V_k^T
Latent Semantic Indexing Example: Retrieval in Concept Space
Latent Semantic Indexing Retrieval Model R = ⟨D, Q, ρ⟩
Latent Semantic Indexing Example 2
Remarks:
Latent Semantic Indexing Example 2: Document Similarity Matrix A^T A
Latent Semantic Indexing Example 2: Term Similarity Matrix A A^T
Latent Semantic Indexing Discussion
Explicit Semantic Analysis Concept Hypothesis
Explicit Semantic Analysis Retrieval Model R = ⟨D, Q, ρ⟩
Explicit Semantic Analysis Document Representation
Explicit Semantic Analysis Relevance Function ρ
Explicit Semantic Analysis Discussion
Chapter IR:V V. Retrieval Models
Generative Models
Language Models Background
Language Models Basics: Grammar
Remarks [Kastens 2005] :
Language Models Basics: Grammar
Remarks:
Language Models Basics: Grammar
Remarks:
Language Models Basics: Chomsky Hierarchy
Remarks:
Language Models Basics: Calculi
Language Models Example: Deterministic Language Model
Remarks:
Language Models Example: Statistical Language Model
Remarks:
Language Models Retrieval Model R = ⟨D, Q, ρ⟩
Language Models Relevance Function ρ: Derivation
Language Models Relevance Function ρ: Estimation
Remarks:
Language Models Relevance Function ρ: Estimation
Remarks:
Language Models Relevance Function ρ: Example
Language Models Relevance Function ρ: Summary
Language Models Query Refinement: Relevance Feedback
Remarks:
Language Models Query Refinement: Relevance Feedback
Language Models Discussion
Chapter IR:V V. Retrieval Models
Combining Evidence
Combining Evidence Bayesian Networks
Combining Evidence Inference Network
Combining Evidence Example: AND Combination
Combining Evidence Inference Network Operators
Combining Evidence Query Language Example
Chapter IR:V V. Retrieval Models
Web Search
Web Search Search Taxonomy
Web Search
Web Search Search Engine Optimization
Web Search
Web Search Term Proximity
Web Search Query types
Chapter IR:V V. Retrieval Models
Learning to Rank Machine Learning and IR
Learning to Rank Generative vs. Discriminative
Learning to Rank Discriminative Models for IR
Chapter IR:VI VI. Users and Queries
Information Needs and Queries Information Needs
Information Needs and Queries Queries
Information Needs and Queries Interaction
Information Needs and Queries ASK Hypothesis
Information Needs and Queries Keyword Queries
Content I. Introduction
Query Transformation and Refinement Query transformation
Query Transformation and Refinement Query-Based Stopping
Query Transformation and Refinement Query-Based Stemming
Query Transformation and Refinement Stem Classes
Query Transformation and Refinement Modifying Stem Classes
Query Transformation and Refinement Spell Checking
Query Transformation and Refinement Edit Distance
Query Transformation and Refinement Soundex Code
Query Transformation and Refinement Spelling Correction Issues
Query Transformation and Refinement Noisy Channel Model
Query Transformation and Refinement Query Expansion
Query Transformation and Refinement Term Association Measures
Query Transformation and Refinement Association Measure Example
Query Transformation and Refinement Association Measures
Query Transformation and Refinement Other Query Expansion Approaches
Query Transformation and Refinement Relevance Feedback
Query Transformation and Refinement Relevance Feedback Example
Query Transformation and Refinement Relevance Feedback
Query Transformation and Refinement Context and Personalization
Query Transformation and Refinement User Models
Query Transformation and Refinement Query Logs
Query Transformation and Refinement Local Search
Query Transformation and Refinement Extracting Location Information
Chapter IR:VI VI. Users and Queries
Cross-Language Search Goals
Cross-Language Search Basic Approach
Cross-Language Search Translation
Content I. Introduction
Showing the Results Snippet Generation
Showing the Results Sentence Selection
Showing the Results Snippet Generation
Showing the Results Snippet Guidelines
Showing the Results Advertising
Showing the Results Searching Advertisements
Showing the Results Example Advertisements
Showing the Results Clustering Results
Showing the Results Clustering Results – Requirements
Showing the Results Types of Classification
Showing the Results Classification Example
Showing the Results Result Clusters
Showing the Results Faceted Classification
Showing the Results Example Faceted Classification
Chapter IR:VIII VIII. Evaluation
Laboratory Experiments Retrieval Tasks
Laboratory Experiments Experimental Setup
Remarks:
Laboratory Experiments Public Resources
Remarks:
Laboratory Experiments Topic Descriptions
Remarks:
Laboratory Experiments Relevance Judgments
Remarks:
Evaluation Corpus Assessment Sampling Strategy: Pooling
Evaluation Corpus Measuring Annotator Agreement
Evaluation Corpus Measuring Annotator Agreement: Kappa Statistics
Remarks:
Chapter IR:VIII VIII. Evaluation
Logging Query Logs
Logging Example Click Policy
Logging Query Logs
Chapter IR:VIII VIII. Evaluation
Effectiveness Metrics Precision and Recall
Remarks:
Effectiveness Metrics Precision and Recall
Remarks:
Effectiveness Metrics Precision and Recall: Illustration
Effectiveness Metrics Precision and Recall: Recall Estimation
Effectiveness Metrics Ranking Effectiveness
Effectiveness Metrics Ranking Effectiveness: Precision@k and Recall@k
Effectiveness Metrics Ranking Effectiveness: Average Precision
Effectiveness Metrics Ranking Effectiveness: Mean Average Precision (MAP)
Effectiveness Metrics Precision-Recall Graph
Effectiveness Metrics Ranking Effectiveness: Mean Reciprocal Rank (MRR)
Effectiveness Metrics Ranking Effectiveness: Discounted Cumulative Gain (DCG)
Effectiveness Metrics Ranking Effectiveness: Normalized Discounted Cumulative Gain (NDCG)
Chapter IR:VIII VIII. Evaluation
Efficiency Metrics
Efficiency Metrics Query throughput
Chapter IR:VIII VIII. Evaluation
Training and Testing Significance Tests
Training and Testing One-Sided Test
Training and Testing Example Experimental Results
Training and Testing t-Test
Training and Testing Wilcoxon Signed-Ranks Test
Training and Testing Sign Test
Training and Testing Setting Parameter Values
Training and Testing Finding Parameter Values
Training and Testing Online Testing
Chapter IR:IX IX. Acquisition
Crawling the Web Web Technology
Crawling the Web Web Technology: Internet
[Internet Systems Consortium, www.isc.org]
Crawling the Web Web Technology: World Wide Web
Crawling the Web Web Technology: Addressing
Crawling the Web Web Technology: Hypertext Transfer Protocol (HTTP)
Crawling the Web Web Technology: Hypertext Markup Language (HTML)
Crawling the Web The Web Graph
Crawling the Web Crawling Hypertext
Crawling the Web Requirements
Crawling the Web Selectivity
Remarks:
Crawling the Web Selectivity: Malicious Pages (Black-hat SEO, Spam)
Crawling the Web Politeness
Crawling the Web Politeness: Robots Exclusion Protocol (robots.txt)
Remarks:
Crawling the Web Politeness: Crawling Algorithm Revisited
Crawling the Web Freshness
Crawling the Web Freshness: Metric
Crawling the Web Freshness: Age Metric
Crawling the Web Efficiency and Scalability
Remarks:
Crawling the Web Extensibility
Chapter IR:IX IX. Acquisition
Conversion File formats
Conversion Character encoding
Conversion Character encoding: Documents lie!
Conversion Character encoding: Even more problems
Conversion Unicode
Conversion Unicode: UTF-8
Chapter IR:IX IX. Acquisition
Storing Documents Creating the document store
Storing Documents Requirements for document storage systems
Storing Documents Large files
Storing Documents Example file in TREC Web compound document format
Storing Documents Compression
Storing Documents BigTable: Google’s document storage system
Storing Documents BigTable
Storing Documents Detecting Duplicates
Storing Documents Near-Duplicate Detection
Storing Documents Fingerprinting example
Storing Documents Fingerprinting
Storing Documents Simhash
Storing Documents Simhash example
Storing Documents Removing noise
Storing Documents First idea to find content blocks: document slope curve
Storing Documents Other ideas to find content blocks