Search Results - (evolution OR evaluation)

  1. SimEval-IR: A Unified Toolkit and Benchmark Suite for Evaluating User Simulators and Search Sessions

    Published in ArXiv cs.IR Recent Papers (2026)
  2. Real-Time GPU-Accelerated Monte Carlo Evaluation of Safety-Critical AEB Systems Under Uncertainty

    Published in ArXiv cs.DC Recent Papers (2026)
  3. Can AI Be a Good Peer Reviewer? A Survey of Peer Review Process, Evaluation, and the Future

    Published in ArXiv cs.CL Recent Papers (2026)
  4. LearnMate^2: Design and Evaluation of an LLM-powered Personalized and Adaptive Support System for Online Learning

    Published in ArXiv cs.HC Recent Papers (2026)
  5. The Missing Evaluation Axis: What 10,000 Student Submissions Reveal About AI Tutor Effectiveness

    Published in ArXiv cs.HC Recent Papers (2026)
  6. PolySQL: Scaling Text-to-SQL Evaluation Across SQL Dialects via Automated Backend Isomorphism

    Published in ArXiv cs.CL Recent Papers (2026)
  7. SmartEval: A Benchmark for Evaluating LLM-Generated Smart Contracts from Natural Language Specifications

    Published in ArXiv cs.MA Recent Papers (2026)
  8. LongMemEval-V2: Evaluating Long-Term Agent Memory Toward Experienced Colleagues

    Published in ArXiv cs.CL Recent Papers (2026)
  9. A Comparative Study of Controlled Text Generation Systems Using Level-Playing-Field Evaluation Principles

    Published in ArXiv cs.CL Recent Papers (2026)
  10. Equivalence of eval-readback and eval-apply big-step evaluators by structuring the lambda-calculus's strategy space

  11. Feasibility trial of a transdiagnostic individual lifestyle evaluation and intervention (Lev-i) for health behavior change

    Published in PLOS ONE (2026)
  12. Agentic AI in Healthcare and Medicine: A Seven-Dimensional Taxonomy for Empirical Evaluation of LLM-Based Agents

    Published in IEEE Access (2026)
  13. Comparative evaluation of large language models in generating clinical insights for HIV associated oral Kaposi sarcoma

    Published in Discover AI (2026)
  14. Integrating DWT and Bayesian Neural Networks for Effective Bearing Fault Detection With Uncertainty Evaluation in Induction Machines

    Published in IEEE Access (2026)
  15. Correction: Diagnostic performance of eNose technology in detecting colorectal cancer recurrence: A prospective evaluation

    Published in PLOS ONE (2026)
  16. Evaluation of morphological variations of mandibular bone in adult bruxers using CBCT: A cross-sectional study

    Published in PLOS ONE (2026)
  17. Design and evaluation of structural risk mitigation measures for transmission lines micro-pile foundations in mountainous region

    Published in PLOS ONE (2026)
  18. Vigi4Eudra-score: Evaluation of the completeness of spontaneous adverse drug reaction reports in EudraVigilance

    Published in PLOS ONE (2026)