stage academic conversation

This commit is contained in:
binary-husky
2025-06-22 18:31:41 +08:00
parent 8c21432291
commit 73f573092b
45 changed files with 9992 additions and 17 deletions


@@ -0,0 +1,76 @@
# ADS query optimization prompt
ADSABS_QUERY_PROMPT = """Analyze and optimize the following query for NASA ADS search.
If the query is not related to astronomy, astrophysics, or physics, return <query>none</query>.
If the query contains non-English terms, translate them to English first.
Query: {query}
Task: Transform the natural language query into an optimized ADS search query.
Always generate English search terms regardless of the input language.
IMPORTANT: Ignore any requirements about journal ranking (CAS, JCR, IF index),
or output format requirements. Focus only on the core research topic for the search query.
Relevant research areas for ADS:
- Astronomy and astrophysics
- Physics (theoretical and experimental)
- Space science and exploration
- Planetary science
- Cosmology
- Astrobiology
- Related instrumentation and methods
Available search fields and filters:
1. Basic fields:
- title: Search in title (title:"term")
- abstract: Search in abstract (abstract:"term")
- author: Search for author names (author:"lastname, firstname")
- year: Filter by year (year:2020-2023)
- bibstem: Search by journal abbreviation (bibstem:ApJ)
2. Boolean operators:
- AND
- OR
- NOT
- (): Group terms
- "": Exact phrase match
3. Special filters:
- citations(identifier:paper): Papers citing a specific paper
- references(identifier:paper): References of a specific paper
- citation_count: Filter by citation count
- database: Filter by database (database:astronomy)
Examples:
1. Query: "Black holes in galaxy centers after 2020"
<query>title:"black hole" AND abstract:"galaxy center" AND year:2020-</query>
2. Query: "Papers by Neil deGrasse Tyson about exoplanets"
<query>author:"Tyson, Neil deGrasse" AND title:exoplanet</query>
3. Query: "Most cited papers about dark matter in ApJ"
<query>title:"dark matter" AND bibstem:ApJ AND citation_count:[100 TO *]</query>
4. Query: "Latest research on diabetes treatment"
<query>none</query>
5. Query: "Machine learning for galaxy classification"
<query>title:("machine learning" OR "deep learning") AND (title:galaxy OR abstract:galaxy) AND abstract:classification</query>
Please analyze the query and respond ONLY with XML tags:
<query>Provide the optimized ADS search query using appropriate fields and operators, or "none" if not relevant</query>"""
# System prompt
ADSABS_QUERY_SYSTEM_PROMPT = """You are an expert at crafting NASA ADS search queries.
Your task is to:
1. First determine if the query is relevant to astronomy, astrophysics, or physics research
2. If relevant, optimize the natural language query for the ADS API
3. If not relevant, return "none" to indicate the query should be handled by other databases
Focus on creating precise queries that will return relevant astronomical and physics literature.
Always generate English search terms regardless of the input language.
Consider using field-specific search terms and appropriate filters to improve search accuracy.
Remember: ADS is specifically for astronomy, astrophysics, and physics research.
Medical, biological, or general research queries should return "none"."""
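
Both ADS prompts ask the model to answer with a single `<query>` tag, or with `none` when the topic is out of scope. A minimal sketch of how a caller might fill the template and read that reply is shown below. The helper name `extract_ads_query` and the shortened `PROMPT_TEMPLATE` stand-in are assumptions for illustration; they are not part of this commit.

```python
import re
from typing import Optional

# Stand-in for ADSABS_QUERY_PROMPT; only the {query} placeholder matters here.
PROMPT_TEMPLATE = (
    "Analyze and optimize the following query for NASA ADS search.\n"
    "Query: {query}"
)


def extract_ads_query(reply: str) -> Optional[str]:
    """Return the text inside <query>...</query>, or None for a 'none' answer."""
    match = re.search(r"<query>(.*?)</query>", reply, flags=re.DOTALL)
    if match is None:
        return None  # the model ignored the requested format
    query = match.group(1).strip()
    # The prompt allows either none or "none" as the out-of-scope marker.
    if not query or query.strip('"').lower() == "none":
        return None
    return query


prompt = PROMPT_TEMPLATE.format(query="Most cited papers about dark matter in ApJ")
reply = '<query>title:"dark matter" AND bibstem:ApJ AND citation_count:[100 TO *]</query>'
print(extract_ads_query(reply))
print(extract_ads_query("<query>none</query>"))
```

Returning `None` for both a missing tag and an explicit `none` lets the caller fall through to another database in either case.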


@@ -0,0 +1,341 @@
# Basic type analysis prompt
ARXIV_TYPE_PROMPT = """Analyze the research query and determine if arXiv search is needed and its type.
Query: {query}
Task 1: Determine if this query requires arXiv search
- arXiv is suitable for:
* Computer science and AI/ML
* Physics and mathematics
* Quantitative biology and finance
* Electrical engineering
* Recent preprints in these fields
- arXiv is NOT needed for:
* Medical research (unless ML/AI applications)
* Social sciences
* Business studies
* Humanities
* Industry reports
Task 2: If arXiv search is needed, determine the most appropriate search type
Available types:
1. basic: Keyword-based search across all fields
- For specific technical queries
- When looking for particular methods or applications
2. category: Category-based search within specific fields
- For broad topic exploration
- When surveying a research area
3. none: arXiv search not needed for this query
- When topic is outside arXiv's scope
- For non-technical or clinical research
Examples:
1. Query: "BERT transformer architecture"
<search_type>basic</search_type>
2. Query: "latest developments in machine learning"
<search_type>category</search_type>
3. Query: "COVID-19 clinical trials"
<search_type>none</search_type>
4. Query: "psychological effects of social media"
<search_type>none</search_type>
Please analyze the query and respond ONLY with XML tags:
<search_type>Choose either 'basic', 'category', or 'none'</search_type>"""
# Query optimization prompt
ARXIV_QUERY_PROMPT = """Optimize the following query for arXiv search.
Query: {query}
Task: Transform the natural language query into an optimized arXiv search query using boolean operators and field tags.
Always generate English search terms regardless of the input language.
IMPORTANT: Ignore any requirements about journal ranking (CAS, JCR, IF index),
or output format requirements. Focus only on the core research topic for the search query.
Available field tags:
- ti: Search in title
- abs: Search in abstract
- au: Search for author
- all: Search in all fields (default)
Boolean operators:
- AND: Both terms must appear
- OR: Either term can appear
- NOT: Exclude terms
- (): Group terms
- "": Exact phrase match
Examples:
1. Natural query: "Recent papers about transformer models by Vaswani"
<query>ti:"transformer model" AND au:Vaswani AND submittedDate:[20170101 TO 20241231]</query>
2. Natural query: "Deep learning for computer vision, excluding surveys"
<query>ti:("deep learning" AND "computer vision") NOT (ti:survey OR ti:review)</query>
3. Natural query: "Attention mechanism in language models"
<query>ti:(attention OR "attention mechanism") AND abs:"language model"</query>
4. Natural query: "GANs or generative adversarial networks for image generation"
<query>(ti:GAN OR ti:"generative adversarial network") AND abs:"image generation"</query>
Please analyze the query and respond ONLY with XML tags:
<query>Provide the optimized search query using appropriate operators and tags</query>
Note:
- Use quotes for exact phrases
- Combine multiple conditions with boolean operators
- Consider both title and abstract for important concepts
- Include author names when relevant
- Use parentheses for complex logical groupings"""
# Sort parameters prompt
ARXIV_SORT_PROMPT = """Determine optimal sorting parameters for the research query.
Query: {query}
Task: Select the most appropriate sorting parameters to help users find the most relevant papers.
Available sorting options:
1. Sort by:
- relevance: Best match to query terms (default)
- lastUpdatedDate: Most recently updated papers
- submittedDate: Most recently submitted papers
2. Sort order:
- descending: Newest/Most relevant first (default)
- ascending: Oldest/Least relevant first
3. Result limit:
- Minimum: 10 papers
- Maximum: 50 papers
- Recommended: 20-30 papers for most queries
Examples:
1. Query: "Latest developments in transformer models"
<sort_by>submittedDate</sort_by>
<sort_order>descending</sort_order>
<limit>30</limit>
2. Query: "Foundational papers about neural networks"
<sort_by>relevance</sort_by>
<sort_order>descending</sort_order>
<limit>20</limit>
3. Query: "Evolution of deep learning since 2012"
<sort_by>submittedDate</sort_by>
<sort_order>ascending</sort_order>
<limit>50</limit>
Please analyze the query and respond ONLY with XML tags:
<sort_by>Choose: relevance, lastUpdatedDate, or submittedDate</sort_by>
<sort_order>Choose: ascending or descending</sort_order>
<limit>Suggest number between 10-50</limit>
Note:
- Choose relevance for specific technical queries
- Use lastUpdatedDate for tracking paper revisions
- Use submittedDate for following recent developments
- Consider query context when setting the limit"""
# System prompts for each task
ARXIV_TYPE_SYSTEM_PROMPT = """You are an expert at analyzing academic queries.
Your task is to determine whether arXiv search is needed and, if so, whether the query is better suited for keyword search or category-based search.
Consider the query's specificity, scope, and intended search area when making your decision.
Always respond in English regardless of the input language."""
ARXIV_QUERY_SYSTEM_PROMPT = """You are an expert at crafting arXiv search queries.
Your task is to optimize natural language queries using boolean operators and field tags.
Focus on creating precise, targeted queries that will return the most relevant results.
Always generate English search terms regardless of the input language."""
ARXIV_CATEGORIES_SYSTEM_PROMPT = """You are an expert at arXiv category classification.
Your task is to select the most relevant categories for the given research query.
Consider both primary and related interdisciplinary categories, while maintaining focus on the main research area.
Always respond in English regardless of the input language."""
ARXIV_SORT_SYSTEM_PROMPT = """You are an expert at optimizing search results.
Your task is to determine the best sorting parameters based on the query context.
Consider the user's likely intent and temporal aspects of the research topic.
Always respond in English regardless of the input language."""
# Combined search prompt
ARXIV_SEARCH_PROMPT = """Analyze and optimize the research query for arXiv search.
Query: {query}
Task: Transform the natural language query into an optimized arXiv search query.
Available search options:
1. Basic search with field tags:
- ti: Search in title
- abs: Search in abstract
- au: Search for author
Example: "ti:transformer AND abs:attention"
2. Category-based search:
- Use specific arXiv categories
Example: 'cat:cs.AI AND all:"neural networks"'
3. Date range:
- Specify date range using submittedDate
Example: 'all:"deep learning" AND submittedDate:[20200101 TO 20231231]'
Examples:
1. Query: "Recent papers about transformer models by Vaswani"
<search_criteria>
<query>ti:"transformer model" AND au:Vaswani AND submittedDate:[20170101 TO 99991231]</query>
<categories>cs.CL, cs.AI, cs.LG</categories>
<sort_by>submittedDate</sort_by>
<sort_order>descending</sort_order>
<limit>30</limit>
</search_criteria>
2. Query: "Latest developments in computer vision"
<search_criteria>
<query>cat:cs.CV AND submittedDate:[20220101 TO 99991231]</query>
<categories>cs.CV, cs.AI, cs.LG</categories>
<sort_by>submittedDate</sort_by>
<sort_order>descending</sort_order>
<limit>25</limit>
</search_criteria>
Please analyze the query and respond with XML tags containing search criteria."""
ARXIV_SEARCH_SYSTEM_PROMPT = """You are an expert at crafting arXiv search queries.
Your task is to analyze research queries and transform them into optimized arXiv search criteria.
Consider query intent, relevant categories, and temporal aspects when creating the search parameters.
Always generate English search terms and respond in English regardless of the input language."""
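
The `<search_criteria>` reply described in `ARXIV_SEARCH_PROMPT` still has to be turned into a request. The sketch below shows one way that could be done against the public arXiv API, whose `search_query`, `sortBy`, `sortOrder`, and `max_results` parameters line up with the options the prompts offer. The helper names `read_tag` and `build_arxiv_url` are hypothetical, and the fallback values are assumptions rather than behavior defined in this commit.

```python
import re
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"
SORT_BY = {"relevance", "lastUpdatedDate", "submittedDate"}
SORT_ORDER = {"ascending", "descending"}


def read_tag(reply: str, name: str, default: str = "") -> str:
    """Return the stripped text of <name>...</name>, or the default if absent."""
    match = re.search(rf"<{name}>(.*?)</{name}>", reply, flags=re.DOTALL)
    return match.group(1).strip() if match else default


def build_arxiv_url(reply: str) -> str:
    """Build an arXiv API URL from a <search_criteria> style reply."""
    query = read_tag(reply, "query")
    categories = [c.strip() for c in read_tag(reply, "categories").split(",") if c.strip()]
    # Only add a category clause when the query does not already contain one.
    if categories and "cat:" not in query:
        clause = " OR ".join(f"cat:{c}" for c in categories)
        query = f"({query}) AND ({clause})" if query else clause

    # Fall back to safe defaults when the model returns an unknown value.
    sort_by = read_tag(reply, "sort_by", "relevance")
    if sort_by not in SORT_BY:
        sort_by = "relevance"
    sort_order = read_tag(reply, "sort_order", "descending")
    if sort_order not in SORT_ORDER:
        sort_order = "descending"

    try:
        limit = int(read_tag(reply, "limit", "20"))
    except ValueError:
        limit = 20
    limit = max(10, min(50, limit))  # the prompts ask for 10-50 results

    params = {
        "search_query": query,
        "sortBy": sort_by,
        "sortOrder": sort_order,
        "max_results": limit,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


reply = """<search_criteria>
<query>cat:cs.CV AND submittedDate:[20220101 TO 99991231]</query>
<categories>cs.CV, cs.AI, cs.LG</categories>
<sort_by>submittedDate</sort_by>
<sort_order>descending</sort_order>
<limit>25</limit>
</search_criteria>"""
print(build_arxiv_url(reply))
```

Validating each tag against a fixed set keeps a malformed model reply from reaching the API as an invalid parameter.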
# Categories selection prompt
ARXIV_CATEGORIES_PROMPT = """Select the most relevant arXiv categories for the research query.
Query: {query}
Task: Choose 2-4 most relevant categories that best match the research topic.
Available Categories:
Computer Science (cs):
- cs.AI: Artificial Intelligence (neural networks, machine learning, NLP)
- cs.CL: Computation and Language (NLP, machine translation)
- cs.CV: Computer Vision and Pattern Recognition
- cs.LG: Machine Learning (deep learning, reinforcement learning)
- cs.NE: Neural and Evolutionary Computing
- cs.RO: Robotics
- cs.IR: Information Retrieval
- cs.SE: Software Engineering
- cs.DB: Databases
- cs.DC: Distributed Computing
- cs.CY: Computers and Society
- cs.HC: Human-Computer Interaction
Mathematics (math):
- math.OC: Optimization and Control
- math.PR: Probability
- math.ST: Statistics
- math.NA: Numerical Analysis
- math.DS: Dynamical Systems
Statistics (stat):
- stat.ML: Machine Learning
- stat.ME: Methodology
- stat.TH: Theory
- stat.AP: Applications
Physics (physics):
- physics.comp-ph: Computational Physics
- physics.data-an: Data Analysis
- physics.soc-ph: Physics and Society
Electrical Engineering (eess):
- eess.SP: Signal Processing
- eess.AS: Audio and Speech Processing
- eess.IV: Image and Video Processing
- eess.SY: Systems and Control
Examples:
1. Query: "Deep learning for computer vision"
<categories>cs.CV, cs.LG, stat.ML</categories>
2. Query: "Natural language processing with transformers"
<categories>cs.CL, cs.AI, cs.LG</categories>
3. Query: "Reinforcement learning for robotics"
<categories>cs.RO, cs.AI, cs.LG</categories>
4. Query: "Statistical methods in machine learning"
<categories>stat.ML, cs.LG, math.ST</categories>
Please analyze the query and respond ONLY with XML tags:
<categories>List 2-4 most relevant categories, comma-separated</categories>
Note:
- Choose primary categories first, then add related ones
- Limit to 2-4 most relevant categories
- Order by relevance (most relevant first)
- Use comma and space between categories (e.g., "cs.AI, cs.LG")"""
# Latest-papers detection prompt
ARXIV_LATEST_PROMPT = """Determine if the query is requesting latest papers from arXiv.
Query: {query}
Task: Analyze if the query is specifically asking for recent/latest papers from arXiv.
IMPORTANT RULE:
- The query MUST explicitly mention "arXiv" or "arxiv" to be considered a latest arXiv papers request
- Queries only asking for recent/latest papers WITHOUT mentioning arXiv should return false
Indicators for latest papers request:
1. MUST HAVE keywords about arXiv:
- "arxiv"
- "arXiv"
AND
2. Keywords about recency:
- "latest"
- "recent"
- "new"
- "newest"
- "just published"
- "this week/month"
Examples:
1. Latest papers request (Valid):
Query: "Show me the latest AI papers on arXiv"
<is_latest_request>true</is_latest_request>
2. Latest papers request (Valid):
Query: "What are the recent papers about transformers on arxiv"
<is_latest_request>true</is_latest_request>
3. Not a latest papers request (Invalid - no mention of arXiv):
Query: "Show me the latest papers about BERT"
<is_latest_request>false</is_latest_request>
4. Not a latest papers request (Invalid - no recency):
Query: "Find papers on arxiv about transformers"
<is_latest_request>false</is_latest_request>
Please analyze the query and respond ONLY with XML tags:
<is_latest_request>true/false</is_latest_request>
Note: The response should be true ONLY if both conditions are met:
1. Query explicitly mentions arXiv/arxiv
2. Query asks for recent/latest papers"""
ARXIV_LATEST_SYSTEM_PROMPT = """You are an expert at analyzing academic queries.
Your task is to determine if the query is specifically requesting latest/recent papers from arXiv.
Remember: The query MUST explicitly mention arXiv to be considered valid, even if it asks for recent papers.
Always respond in English regardless of the input language."""
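
The rule in `ARXIV_LATEST_PROMPT` is a conjunction of two keyword checks, so it can also be approximated without a model call, for example as a cheap pre-filter or as a sanity check on the model's answer. The sketch below is an assumption about how such a guard might look; the function name `is_latest_arxiv_request` and the keyword tuple are illustrative and do not come from this commit.

```python
import re

# Recency keywords taken from the list in ARXIV_LATEST_PROMPT.
RECENCY_KEYWORDS = (
    "latest", "recent", "newest", "new",
    "just published", "this week", "this month",
)


def is_latest_arxiv_request(query: str) -> bool:
    """True only if the query mentions arXiv AND asks for recent papers."""
    text = query.lower()
    mentions_arxiv = "arxiv" in text
    # Word boundaries keep "new" from matching inside words such as "renewal".
    asks_recent = any(
        re.search(rf"\b{re.escape(keyword)}\b", text) for keyword in RECENCY_KEYWORDS
    )
    return mentions_arxiv and asks_recent


for example in (
    "Show me the latest AI papers on arXiv",
    "Show me the latest papers about BERT",
    "Find papers on arxiv about transformers",
):
    print(example, "->", is_latest_arxiv_request(example))
```

A keyword guard like this only sees English input, so it would complement the prompt rather than replace it.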


@@ -0,0 +1,55 @@
# Crossref query optimization prompt
CROSSREF_QUERY_PROMPT = """Analyze and optimize the query for Crossref search.
Query: {query}
Task: Transform the natural language query into an optimized Crossref search query.
Always generate English search terms regardless of the input language.
IMPORTANT: Ignore any requirements about journal ranking (CAS, JCR, IF index),
or output format requirements. Focus only on the core research topic for the search query.
Available search fields and filters:
1. Basic fields:
- title: Search in title
- abstract: Search in abstract
- author: Search for author names
- container-title: Search in journal/conference name
- publisher: Search by publisher name
- type: Filter by work type (journal-article, book-chapter, etc.)
- year: Filter by publication year
2. Boolean operators:
- AND: Both terms must appear
- OR: Either term can appear
- NOT: Exclude terms
- "": Exact phrase match
3. Special filters:
- is-referenced-by-count: Filter by citation count
- from-pub-date: Filter by publication date
- has-abstract: Filter papers with abstracts
Examples:
1. Query: "Machine learning in healthcare after 2020"
<query>title:"machine learning" AND title:healthcare AND from-pub-date:2020</query>
2. Query: "Papers by Geoffrey Hinton about deep learning"
<query>author:"Hinton, Geoffrey" AND (title:"deep learning" OR abstract:"deep learning")</query>
3. Query: "Most cited papers about transformers in Nature"
<query>title:transformer AND container-title:Nature AND is-referenced-by-count:[100 TO *]</query>
4. Query: "Recent BERT applications in medical domain"
<query>title:BERT AND abstract:medical AND from-pub-date:2020 AND type:journal-article</query>
Please analyze the query and respond ONLY with XML tags:
<query>Provide the optimized Crossref search query using appropriate fields and operators</query>"""
# System prompt
CROSSREF_QUERY_SYSTEM_PROMPT = """You are an expert at crafting Crossref search queries.
Your task is to optimize natural language queries for Crossref's API.
Focus on creating precise queries that will return relevant results.
Always generate English search terms regardless of the input language.
Consider using field-specific search terms and appropriate filters to improve search accuracy."""
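
The prompt above produces a single `field:"term"` string, while the Crossref REST API, as far as it is publicly documented, takes field queries and filters as separate URL parameters. The sketch below shows one possible translation step. The mapping tables and the function name `to_crossref_params` are assumptions for illustration, and boolean structure (OR, NOT, grouping) is deliberately flattened.

```python
import re
from urllib.parse import urlencode

# Assumed mapping from prompt fields to Crossref query parameters.
FIELD_PARAMS = {
    "title": "query.bibliographic",
    "author": "query.author",
    "container-title": "query.container-title",
    "abstract": "query",  # no dedicated parameter assumed; use free-text query
}
# Prompt fields that are passed through as Crossref filters.
FILTER_FIELDS = {"from-pub-date", "type", "has-abstract"}

TOKEN = re.compile(r'([\w-]+):("[^"]*"|\S+)')


def to_crossref_params(query: str, rows: int = 20) -> dict:
    """Split a field:value query string into Crossref request parameters."""
    params: dict = {}
    filters = []
    for field, value in TOKEN.findall(query):
        value = value.strip('"').rstrip(")")
        if field in FIELD_PARAMS:
            key = FIELD_PARAMS[field]
            # Several terms for the same parameter are joined with spaces.
            params[key] = f"{params[key]} {value}" if key in params else value
        elif field in FILTER_FIELDS:
            filters.append(f"{field}:{value}")
        # Any other field is ignored in this sketch.
    if filters:
        params["filter"] = ",".join(filters)
    params["rows"] = rows
    return params


query = "title:BERT AND abstract:medical AND from-pub-date:2020 AND type:journal-article"
params = to_crossref_params(query)
print("https://api.crossref.org/works?" + urlencode(params))
```

Because the boolean operators are dropped, this mapping is lossy; a stricter implementation would need a real parser for the generated query.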


@@ -0,0 +1,47 @@
# Paper identification prompt
PAPER_IDENTIFY_PROMPT = """Analyze the query to identify paper details.
Query: {query}
Task: Extract paper identification information from the query.
Always generate English search terms regardless of the input language.
IMPORTANT: Ignore any requirements about journal ranking (CAS, JCR, IF index),
or output format requirements. Focus only on identifying paper details.
Possible paper identifiers:
1. arXiv ID (e.g., 2103.14030, arXiv:2103.14030)
2. DOI (e.g., 10.1234/xxx.xxx)
3. Paper title (e.g., "Attention is All You Need")
Examples:
1. Query with arXiv ID:
Query: "Analyze paper 2103.14030"
<paper_info>
<paper_source>arxiv</paper_source>
<paper_id>2103.14030</paper_id>
<paper_title></paper_title>
</paper_info>
2. Query with DOI:
Query: "Review the paper with DOI 10.1234/xxx.xxx"
<paper_info>
<paper_source>doi</paper_source>
<paper_id>10.1234/xxx.xxx</paper_id>
<paper_title></paper_title>
</paper_info>
3. Query with paper title:
Query: "Analyze 'Attention is All You Need' paper"
<paper_info>
<paper_source>title</paper_source>
<paper_id></paper_id>
<paper_title>Attention is All You Need</paper_title>
</paper_info>
Please analyze the query and respond ONLY with XML tags containing paper information."""
PAPER_IDENTIFY_SYSTEM_PROMPT = """You are an expert at identifying academic paper references.
Your task is to extract paper identification information from queries.
Look for arXiv IDs, DOIs, and paper titles."""
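
The three identifier kinds listed in `PAPER_IDENTIFY_PROMPT` have fairly regular shapes, so a regex pass can serve as a fallback or cross-check for the model's `<paper_info>` answer. The following is a minimal sketch under that assumption. The patterns, the `PaperInfo` dataclass, and `identify_paper` are hypothetical names, and the patterns cover only new-style arXiv IDs and titles given in quotes.

```python
import re
from dataclasses import dataclass
from typing import Optional

# New-style arXiv ID such as 2103.14030, optionally with a version suffix.
ARXIV_ID = re.compile(r"(?<![\d.])(\d{4}\.\d{4,5}(?:v\d+)?)(?!\d)")
# DOI: "10." + registrant code + "/" + suffix without whitespace or quotes.
DOI = re.compile(r"\b(10\.\d{4,9}/[^\s\"'<>]+)")
# A title is only recognized when the query puts it in quotes.
QUOTED_TITLE = re.compile(r"[\"'“‘]([^\"'“”‘’]{5,})[\"'”’]")


@dataclass
class PaperInfo:
    source: str          # "arxiv", "doi", or "title"
    paper_id: str = ""
    title: str = ""


def identify_paper(query: str) -> Optional[PaperInfo]:
    """Return the first identifier found, checking DOI, arXiv ID, then title."""
    # DOIs are checked first because a DOI suffix can contain an arXiv-like ID.
    doi = DOI.search(query)
    if doi:
        return PaperInfo("doi", paper_id=doi.group(1).rstrip(".,;"))
    arxiv = ARXIV_ID.search(query)
    if arxiv:
        return PaperInfo("arxiv", paper_id=arxiv.group(1))
    title = QUOTED_TITLE.search(query)
    if title:
        return PaperInfo("title", title=title.group(1).strip())
    return None


print(identify_paper("Analyze paper 2103.14030"))
print(identify_paper("Review the paper with DOI 10.1234/xxx.xxx"))
print(identify_paper("Analyze 'Attention is All You Need' paper"))
```

Titles that contain an apostrophe would be cut short by the quoted-title pattern, which is one reason to keep the model-based path as the primary one.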


@@ -0,0 +1,108 @@
# PubMed search type prompt
PUBMED_TYPE_PROMPT = """Analyze the research query and determine the appropriate PubMed search type.
Query: {query}
Available search types:
1. basic: General keyword search for medical/biomedical topics
2. author: Search by author name
3. journal: Search within specific journals
4. none: Query not related to medical/biomedical research
Examples:
1. Query: "COVID-19 treatment outcomes"
<search_type>basic</search_type>
2. Query: "Papers by Anthony Fauci"
<search_type>author</search_type>
3. Query: "Recent papers in Nature about CRISPR"
<search_type>journal</search_type>
4. Query: "Deep learning for computer vision"
<search_type>none</search_type>
5. Query: "Transformer architecture for NLP"
<search_type>none</search_type>
Please analyze the query and respond ONLY with XML tags:
<search_type>Choose: basic, author, journal, or none</search_type>"""
# PubMed query optimization prompt
PUBMED_QUERY_PROMPT = """Optimize the following query for PubMed search.
Query: {query}
Task: Transform the natural language query into an optimized PubMed search query.
Requirements:
- Always generate English search terms regardless of input language
- Translate any non-English terms to English before creating the query
- Never include non-English characters in the final query
IMPORTANT: Ignore any requirements about journal ranking (CAS, JCR, IF index),
or output format requirements. Focus only on the core medical/biomedical topic for the search query.
Available field tags:
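- [Title] - Search in title
- [Title/Abstract] - Search in title and abstract
- [Author] - Search for author
- [Journal] - Search in journal name
- [MeSH Terms] - Search using MeSH terms
- [Publication Type] - Filter by publication type (e.g., review)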
Boolean operators:
- AND
- OR
- NOT
Examples:
1. Query: "COVID-19 treatment in elderly patients"
<query>COVID-19[Title] AND treatment[Title/Abstract] AND elderly[Title/Abstract]</query>
2. Query: "Cancer immunotherapy review articles"
<query>cancer immunotherapy[Title/Abstract] AND review[Publication Type]</query>
Please analyze the query and respond ONLY with XML tags:
<query>Provide the optimized PubMed search query</query>"""
# PubMed sort parameters prompt
PUBMED_SORT_PROMPT = """Determine optimal sorting parameters for PubMed results.
Query: {query}
Task: Select the most appropriate sorting method and result limit.
Available sort options:
- relevance: Best match to query
- date: Most recent first
- journal: Sort by journal name
Examples:
1. Query: "Latest developments in gene therapy"
<sort_by>date</sort_by>
<limit>30</limit>
2. Query: "Classic papers about DNA structure"
<sort_by>relevance</sort_by>
<limit>20</limit>
Please analyze the query and respond ONLY with XML tags:
<sort_by>Choose: relevance, date, or journal</sort_by>
<limit>Suggest number between 10-50</limit>"""
# System prompts
PUBMED_TYPE_SYSTEM_PROMPT = """You are an expert at analyzing medical and scientific queries.
Your task is to determine the most appropriate PubMed search type.
Consider the query's focus and intended search scope.
Always respond in English regardless of the input language."""
PUBMED_QUERY_SYSTEM_PROMPT = """You are an expert at crafting PubMed search queries.
Your task is to optimize natural language queries using PubMed's search syntax.
Focus on creating precise, targeted queries that will return relevant medical literature.
Always generate English search terms regardless of the input language."""
PUBMED_SORT_SYSTEM_PROMPT = """You are an expert at optimizing PubMed search results.
Your task is to determine the best sorting parameters based on the query context.
Consider the balance between relevance and recency.
Always respond in English regardless of the input language."""
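
The sort names used in `PUBMED_SORT_PROMPT` (`relevance`, `date`, `journal`) are the prompt's own vocabulary, so a caller would need to map them onto whatever the search backend expects. The sketch below assumes the NCBI E-utilities `esearch` endpoint and its `pub_date` and `JournalName` sort values. The helper names and the default values are illustrative and not taken from this commit.

```python
import re
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
# Assumed mapping from the prompt's sort names to E-utilities sort values.
SORT_MAP = {"relevance": "relevance", "date": "pub_date", "journal": "JournalName"}


def read_tag(reply: str, name: str, default: str) -> str:
    """Return the stripped text of <name>...</name>, or the default if absent."""
    match = re.search(rf"<{name}>(.*?)</{name}>", reply, flags=re.DOTALL)
    return match.group(1).strip() if match else default


def build_esearch_url(term: str, sort_reply: str) -> str:
    """Combine an optimized PubMed query with the model's sort answer."""
    sort = SORT_MAP.get(read_tag(sort_reply, "sort_by", "relevance"), "relevance")
    try:
        limit = int(read_tag(sort_reply, "limit", "20"))
    except ValueError:
        limit = 20
    limit = max(10, min(50, limit))  # the prompt asks for 10-50 results
    params = {
        "db": "pubmed",
        "term": term,
        "retmax": limit,
        "sort": sort,
        "retmode": "json",
    }
    return f"{ESEARCH}?{urlencode(params)}"


term = "cancer immunotherapy[Title/Abstract] AND review[Publication Type]"
print(build_esearch_url(term, "<sort_by>date</sort_by>\n<limit>30</limit>"))
```

Clamping the limit in code means the 10-50 range holds even when the model suggests a number outside it.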


@@ -0,0 +1,276 @@
# Search type prompt
SEMANTIC_TYPE_PROMPT = """Determine the most appropriate search type for Semantic Scholar.
Query: {query}
Task: Analyze the research query and select the most appropriate search type for Semantic Scholar API.
Available search types:
1. paper: General paper search
- Use for broad topic searches
- Looking for specific papers
- Keyword-based searches
Example: "transformer models in NLP"
2. author: Author-based search
- Finding works by specific researchers
- Author profile analysis
Example: "papers by Yoshua Bengio"
3. paper_details: Specific paper lookup
- Getting details about a known paper
- Finding specific versions or citations
Example: "Attention is All You Need paper details"
4. citations: Citation analysis
- Finding papers that cite a specific work
- Impact analysis
Example: "papers citing BERT"
5. references: Reference analysis
- Finding papers cited by a specific work
- Background research
Example: "references in GPT-3 paper"
6. recommendations: Paper recommendations
- Finding similar papers
- Research direction exploration
Example: "papers similar to Transformer"
Examples:
1. Query: "Latest papers about deep learning"
<search_type>paper</search_type>
2. Query: "Works by Geoffrey Hinton since 2020"
<search_type>author</search_type>
3. Query: "Papers citing the original Transformer paper"
<search_type>citations</search_type>
Please analyze the query and respond ONLY with XML tags:
<search_type>Choose the most appropriate search type from the list above</search_type>"""
# Query optimization prompt
SEMANTIC_QUERY_PROMPT = """Optimize the following query for Semantic Scholar search.
Query: {query}
Task: Transform the natural language query into an optimized search query for maximum relevance.
Always generate English search terms regardless of the input language.
IMPORTANT: Ignore any requirements about journal ranking (CAS, JCR, IF index),
or output format requirements. Focus only on the core research topic for the search query.
Query optimization guidelines:
1. Use quotes for exact phrases
- Ensures exact matching
- Reduces irrelevant results
Example: "attention mechanism" (exact phrase) vs attention mechanism (loose match)
2. Include key technical terms
- Use specific technical terminology
- Include common variations
Example: "transformer architecture" neural networks
3. Author names (if relevant)
- Include full names when known
- Consider common name variations
Example: "Geoffrey Hinton" OR "G. E. Hinton"
Examples:
1. Natural query: "Recent advances in transformer models"
<query>"transformer model" "neural architecture" deep learning</query>
2. Natural query: "BERT applications in text classification"
<query>"BERT" "text classification" "language model" application</query>
3. Natural query: "Deep learning for computer vision by Kaiming He"
<query>"deep learning" "computer vision" author:"Kaiming He"</query>
Please analyze the query and respond ONLY with XML tags:
<query>Provide the optimized search query</query>
Note:
- Balance between specificity and coverage
- Include important technical terms
- Use quotes for key phrases
- Consider synonyms and related terms"""
# Fields selection prompt
SEMANTIC_FIELDS_PROMPT = """Select relevant fields to retrieve from Semantic Scholar.
Query: {query}
Task: Determine which paper fields should be retrieved based on the research needs.
Available fields:
Core fields:
- title: Paper title (always included)
- abstract: Full paper abstract
- authors: Author information
- year: Publication year
- venue: Publication venue
Citation fields:
- citations: Papers citing this work
- references: Papers cited by this work
Additional fields:
- embedding: Paper vector embedding
- tldr: AI-generated summary
- url: Paper URL
Examples:
1. Query: "Latest developments in NLP"
<fields>title, abstract, authors, year, venue, citations</fields>
2. Query: "Most influential papers in deep learning"
<fields>title, abstract, authors, year, citations, references</fields>
3. Query: "Survey of transformer architectures"
<fields>title, abstract, authors, year, tldr, references</fields>
Please analyze the query and respond ONLY with XML tags:
<fields>List relevant fields, comma-separated</fields>
Note:
- Choose fields based on the query's purpose
- Include citation data for impact analysis
- Consider tldr for quick paper screening
- Balance completeness with API efficiency"""
# Sort parameters prompt
SEMANTIC_SORT_PROMPT = """Determine optimal sorting parameters for the query.
Query: {query}
Task: Select the most appropriate sorting method and result limit for the search.
Always respond in English regardless of the input language.
Sorting options:
1. relevance (default)
- Best match to query terms
- Recommended for specific technical searches
Example: "specific algorithm implementations"
2. citations
- Sort by citation count
- Best for finding influential papers
Example: "most important papers in deep learning"
3. year
- Sort by publication date
- Best for following recent developments
Example: "latest advances in NLP"
Examples:
1. Query: "Recent breakthroughs in AI"
<sort_by>year</sort_by>
<limit>30</limit>
2. Query: "Most influential papers about GANs"
<sort_by>citations</sort_by>
<limit>20</limit>
3. Query: "Specific papers about BERT fine-tuning"
<sort_by>relevance</sort_by>
<limit>25</limit>
Please analyze the query and respond ONLY with XML tags:
<sort_by>Choose: relevance, citations, or year</sort_by>
<limit>Suggest number between 10-50</limit>
Note:
- Consider the query's temporal aspects
- Balance between comprehensive coverage and information overload
- Use citation sorting for impact analysis
- Use year sorting for tracking developments"""
# System prompts for each task
SEMANTIC_TYPE_SYSTEM_PROMPT = """You are an expert at analyzing academic queries.
Your task is to determine the most appropriate type of search on Semantic Scholar.
Consider the query's intent, scope, and specific research needs.
Always respond in English regardless of the input language."""
SEMANTIC_QUERY_SYSTEM_PROMPT = """You are an expert at crafting Semantic Scholar search queries.
Your task is to optimize natural language queries for maximum relevance.
Focus on creating precise queries that leverage the platform's search capabilities.
Always generate English search terms regardless of the input language."""
SEMANTIC_FIELDS_SYSTEM_PROMPT = """You are an expert at Semantic Scholar data fields.
Your task is to select the most relevant fields based on the research context.
Consider both essential and supplementary information needs.
Always respond in English regardless of the input language."""
SEMANTIC_SORT_SYSTEM_PROMPT = """You are an expert at optimizing search results.
Your task is to determine the best sorting parameters based on the query context.
Consider the balance between relevance, impact, and recency.
Always respond in English regardless of the input language."""
# Combined search prompt
SEMANTIC_SEARCH_PROMPT = """Analyze and optimize the research query for Semantic Scholar search.
Query: {query}
Task: Transform the natural language query into optimized search criteria for Semantic Scholar.
IMPORTANT: Ignore any requirements about journal ranking (CAS, JCR, IF index),
or output format requirements when generating the search terms. These requirements
should be considered only for post-search filtering, not as part of the core query.
Available search options:
1. Paper search:
- Title and abstract search
- Author search
- Field-specific search
Example: "transformer architecture neural networks"
2. Field tags:
- title: Search in title
- abstract: Search in abstract
- authors: Search by author names
- venue: Search by publication venue
Example: "title:transformer authors:\"Vaswani\""
3. Advanced options:
- Year range filtering
- Citation count filtering
- Venue filtering
Example: "deep learning year>2020 venue:\"NeurIPS\""
Examples:
1. Query: "Recent transformer papers by Vaswani with high impact"
<search_criteria>
<query>title:transformer authors:"Vaswani" year>2017</query>
<search_type>paper</search_type>
<fields>title,abstract,authors,year,citations,venue</fields>
<sort_by>citations</sort_by>
<limit>30</limit>
</search_criteria>
2. Query: "Most cited papers about BERT in top conferences"
<search_criteria>
<query>title:BERT venue:"ACL|EMNLP|NAACL"</query>
<search_type>paper</search_type>
<fields>title,abstract,authors,year,citations,venue,references</fields>
<sort_by>citations</sort_by>
<limit>25</limit>
</search_criteria>
Please analyze the query and respond with XML tags containing complete search criteria."""
SEMANTIC_SEARCH_SYSTEM_PROMPT = """You are an expert at crafting Semantic Scholar search queries.
Your task is to analyze research queries and transform them into optimized search criteria.
Consider query intent, field relevance, and citation impact when creating the search parameters.
Focus on producing precise and comprehensive search criteria that will yield the most relevant results.
Always generate English search terms and respond in English regardless of the input language."""
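
The `<fields>` answer requested by `SEMANTIC_FIELDS_PROMPT` is free text from a model, so it is worth validating before it reaches the API. The sketch below filters the answer against the field list from the prompt, always keeps `title`, and drops duplicates. It assumes the Semantic Scholar Graph API paper search endpoint, and it maps `citations` and `references` to the `citationCount` and `referenceCount` fields as a simplification; the function names are hypothetical.

```python
import re
from urllib.parse import urlencode

S2_SEARCH = "https://api.semanticscholar.org/graph/v1/paper/search"
# Field names offered by SEMANTIC_FIELDS_PROMPT.
ALLOWED_FIELDS = (
    "title", "abstract", "authors", "year", "venue",
    "citations", "references", "embedding", "tldr", "url",
)
# Assumed simplification: request counts instead of full citation lists.
API_FIELD_NAMES = {"citations": "citationCount", "references": "referenceCount"}


def parse_fields(reply: str) -> list:
    """Return validated, de-duplicated field names; title is always first."""
    match = re.search(r"<fields>(.*?)</fields>", reply, flags=re.DOTALL)
    requested = [f.strip().lower() for f in match.group(1).split(",")] if match else []
    fields = ["title"]
    for name in requested:
        if name in ALLOWED_FIELDS and name not in fields:
            fields.append(name)
    return fields


def build_search_url(query: str, fields_reply: str, limit: int = 20) -> str:
    """Build a paper search URL from a query and the model's <fields> answer."""
    api_fields = [API_FIELD_NAMES.get(f, f) for f in parse_fields(fields_reply)]
    params = {"query": query, "fields": ",".join(api_fields), "limit": limit}
    return f"{S2_SEARCH}?{urlencode(params)}"


reply = "<fields>abstract, venue, venue, bogus, citations</fields>"
print(parse_fields(reply))
print(build_search_url('"BERT" "text classification"', reply))
```

Unknown names such as `bogus` are dropped silently here; logging them instead would make prompt regressions easier to spot.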