<?xml version="1.0"?><rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:edm="http://www.europeana.eu/schemas/edm/" xmlns:wgs84_pos="http://www.w3.org/2003/01/geo/wgs84_pos" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:rdaGr2="http://rdvocab.info/ElementsGr2" xmlns:oai="http://www.openarchives.org/OAI/2.0/" xmlns:owl="http://www.w3.org/2002/07/owl#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:ore="http://www.openarchives.org/ore/terms/" xmlns:skos="http://www.w3.org/2004/02/skos/core#" xmlns:dcterms="http://purl.org/dc/terms/"><edm:WebResource rdf:about="http://www.dlib.si/stream/URN:NBN:SI:DOC-O44E68DC/7b6bdc04-80a3-4c29-8083-cc4dcfc3ead0/PDF"><dcterms:extent>928 KB</dcterms:extent></edm:WebResource><edm:WebResource rdf:about="http://www.dlib.si/stream/URN:NBN:SI:DOC-O44E68DC/6ad86197-6457-4137-9072-c91e6a17c5ca/TEXT"><dcterms:extent>64 KB</dcterms:extent></edm:WebResource><edm:TimeSpan rdf:about="2010-2026"><edm:begin xml:lang="en">2010</edm:begin><edm:end xml:lang="en">2026</edm:end></edm:TimeSpan><edm:ProvidedCHO rdf:about="URN:NBN:SI:DOC-O44E68DC"><dcterms:isPartOf rdf:resource="https://www.dlib.si/details/URN:NBN:SI:spr-WHDNAMJH" /><dcterms:issued>2026</dcterms:issued><dc:creator>Jurišić, Marko</dc:creator><dc:creator>Tomičić, Igor</dc:creator><dc:format xml:lang="sl">str. 1-24</dc:format><dc:format xml:lang="sl">številka:vol. 
28</dc:format><dc:identifier>ISSN:2232-2981</dc:identifier><dc:identifier>COBISSID_HOST:273493251</dc:identifier><dc:identifier>URN:URN:NBN:SI:doc-O44E68DC</dc:identifier><dc:language>en</dc:language><dc:publisher xml:lang="sl">Univerza v Mariboru, Fakulteta za varnostne vede</dc:publisher><dcterms:isPartOf xml:lang="sl">Varstvoslovje</dcterms:isPartOf><dc:subject xml:lang="en">anomaly detection</dc:subject><dc:subject xml:lang="sl">CERT</dc:subject><dc:subject xml:lang="en">CERT dataset</dc:subject><dc:subject xml:lang="en">dataset bias</dc:subject><dc:subject xml:lang="en">evaluation metrics</dc:subject><dc:subject xml:lang="sl">evalvacijske metrike</dc:subject><dc:subject xml:lang="en">insider threat detection</dc:subject><dc:subject xml:lang="en">machine learning</dc:subject><dc:subject xml:lang="sl">nabor podatkov</dc:subject><dc:subject xml:lang="sl">odkrivanje notranjih groženj</dc:subject><dc:subject xml:lang="sl">pristranskost nabora podatkov</dc:subject><dc:subject xml:lang="sl">strojno učenje</dc:subject><dc:subject xml:lang="sl">zaznavanje anomalij</dc:subject><dcterms:temporal rdf:resource="2010-2026" /><dc:title xml:lang="sl">The CERT dataset decade: a systematic review of methodological evolution and performance bias</dc:title><dc:description xml:lang="sl">Purpose: This paper identifies methodological biases and limitations in machine learning–based insider threat detection using the Computer Emergency Response Team (CERT) dataset, in order to guide the development of more realistic, robust, and operationally relevant detection approaches. 
Design/Methods/Approach: The objectives are achieved through a systematic literature analysis of 131 peer-reviewed studies, published between 2013 and 2025, that apply machine learning to insider threat detection using the CERT dataset. The analysis employs a selection process guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) and a structured comparative framework to examine dataset versions, feature engineering strategies, model architectures, and evaluation metrics from a methodological and empirical perspective. Findings: The analysis shows that most studies rely on the less realistic CERT v4.2 dataset, resulting in inflated performance results that do not generalize to operational settings. It also finds that feature engineering is a stronger determinant of detection performance than model complexity, while inconsistent evaluation practices hinder meaningful comparison across studies. Research Limitations / Implications: The study is limited by its reliance on published research using a single synthetic dataset, which constrains generalization to real-world environments. Practical Implications: The findings indicate that practitioners should be cautious when adopting models validated on simplified benchmark settings and should instead prioritize solutions tested under extreme class imbalance. Emphasis should be placed on robust feature engineering, unsupervised or hybrid detection approaches, and evaluation metrics. 
Originality/Value: This paper provides the first large-scale, methodologically focused analysis of insider threat detection research that explicitly exposes performance inflation caused by dataset version bias and evaluation inconsistency, offering concrete, evidence-based guidance for improving the realism, comparability, and operational value of future studies in the field.</dc:description><dc:description xml:lang="sl">Namen prispevka: Namen prispevka je opredeliti metodološke pristranskosti in omejitve pri odkrivanju notranjih groženj na osnovi strojnega učenja z uporabo nabora podatkov CERT, da bi usmerili razvoj bolj realističnih, robustnih in operativno uporabnih pristopov za zaznavanje. Metode: Cilji so doseženi s sistematično analizo literature 131 recenziranih študij, objavljenih med letoma 2013 in 2025, ki uporabljajo strojno učenje za odkrivanje notranjih groženj na podlagi nabora podatkov CERT. Uporabljena sta bila postopek izbora po smernicah Prednostne postavke poročanja za sistematične preglede in metaanalize (angl. PRISMA – Preferred Reporting Items for Systematic Reviews and Meta-Analyses) ter strukturiran primerjalni okvir za proučevanje različic nabora podatkov, strategij značilnosti inženiringa, arhitektur modelov in evalvacijskih metrik z metodološkega in empiričnega vidika. Ugotovitve: Analiza kaže, da se večina študij zanaša na manj realističen nabor podatkov CERT v4.2, kar vodi do precenjenih rezultatov zmogljivosti, ki se ne posplošujejo na operativna okolja. Poleg tega ugotavlja, da je značilnost inženiringa pomembnejši dejavnik uspešnosti zaznavanja kot kompleksnost modelov, medtem ko nedosledne evalvacijske prakse otežujejo smiselno primerjavo med študijami. Omejitve/uporabnost raziskave: Študija je omejena zaradi zanašanja na objavljeno literaturo, ki uporablja en sam sintetični nabor podatkov, kar omejuje posploševanje na resnična okolja. 
Praktična uporabnost: Ugotovitve kažejo, da bi morali biti praktiki previdni pri uvajanju modelov, validiranih na poenostavljenih referenčnih okoljih, ter namesto tega dajati prednost rešitvam, preizkušenim v pogojih izrazite neuravnoteženosti razredov. Poudarek bi moral biti na značilnosti robustnega inženiringa, nenadzorovanih ali hibridnih pristopih zaznavanja ter evalvacijskih metrikah. Izvirnost/pomembnost prispevka: Prispevek predstavlja prvo obsežno, metodološko usmerjeno analizo raziskav na področju odkrivanja notranjih groženj, ki izrecno razkriva precenjenost rezultatov zmogljivosti zaradi pristranskosti različic naborov podatkov in nedoslednosti evalvacije ter ponuja konkretna, na dokazih temelječa priporočila za izboljšanje realističnosti, primerljivosti in operativne vrednosti prihodnjih raziskav na tem področju</dc:description><edm:type>TEXT</edm:type><dc:type xml:lang="sl">znanstveno časopisje</dc:type><dc:type xml:lang="en">journals</dc:type><dc:type rdf:resource="http://www.wikidata.org/entity/Q361785" /></edm:ProvidedCHO><ore:Aggregation rdf:about="http://www.dlib.si/?URN=URN:NBN:SI:DOC-O44E68DC"><edm:aggregatedCHO rdf:resource="URN:NBN:SI:DOC-O44E68DC" /><edm:isShownBy rdf:resource="http://www.dlib.si/stream/URN:NBN:SI:DOC-O44E68DC/7b6bdc04-80a3-4c29-8083-cc4dcfc3ead0/PDF" /><edm:rights rdf:resource="http://creativecommons.org/licenses/by/4.0/" /><edm:provider>Slovenian National E-content Aggregator</edm:provider><edm:intermediateProvider xml:lang="en">National and University Library of Slovenia</edm:intermediateProvider><edm:dataProvider xml:lang="sl">Univerza v Mariboru, Fakulteta za varnostne vede</edm:dataProvider><edm:object rdf:resource="http://www.dlib.si/streamdb/URN:NBN:SI:DOC-O44E68DC/maxi/edm" /><edm:isShownAt rdf:resource="http://www.dlib.si/details/URN:NBN:SI:DOC-O44E68DC" /></ore:Aggregation></rdf:RDF>