Results for ' corpus web'

1000+ found
  1.  5
    Hyperbase Web. (Hyper)Bases, Corpus, Langage. Laurent Vanni - 2024 - Corpus 25.
    Hyperbase is a textual data analysis (Analyse de Données Textuelles, ADT) software package that offers a suite of statistical tools dedicated to the study of corpora. Initially developed for desktop computers, it has been available since 2015 as a web platform whose carefully designed interface is geared toward use in the humanities and social sciences. After a methodological review of ADT, this contribution presents Hyperbase Web version 2024 through concrete usage examples, technical notes, and menu-by-menu entries (user manual). (...)
    No categories
  2. Computational Linguistics Research-Corpus-Based Knowledge Acquisition-Web-Based Measurements of Intra-collocational Cohesion in Oxford Collocations Dictionary. Igor A. Bolshakov & Sofia N. Galicia-Haro - 2006 - In O. Stock & M. Schaerf (eds.), Lecture Notes in Computer Science. Springer Verlag. pp. 3878--93.
    No categories
     
  3.  11
    Web corpora through the lens of Call. Eva Schaeffer-Lacroix - 2020 - Corpus 20.
    This article examines the potential of web corpora for foreign language teaching and learning. Web corpora are very large collections of texts drawn from the Internet. Their compilation is largely automated, which gives them certain characteristics that may puzzle specialists in technology-mediated language learning (apprentissage des langues médiées par les technologies, ALMT). Nevertheless, the following arguments in their favor may convince not only linguists but also specialists in language pedagogy: web corpora contain very large quantities of data, making it possible (...)
    No categories
  4.  10
    Sémiotique des textscapes : quelle contribution du textscape linguistique à la mise en scène des langues dans un corpus de sites web? Marie-Hélène Hermand - 2023 - Semiotica 2023 (252):1-26.
    This article offers a theoretical and methodological reflection for describing and interpreting textscapes for the purposes of communication analysis. The aim is to test the semiotic concept of the linguistic textscape in order to analyze how languages are staged in the web communication of organizations. The theoretical framework draws on Linguistic Landscape Studies and on the theory of textual landscapes (textscapes). The corpus consists of 100 websites of economic organizations (chambers of (...)
  5.  42
    Corpus Linguistics Methods in the Study of (Meta)Argumentation. Martin Hinton - 2020 - Argumentation 35 (3):435-455.
    As more and more sophisticated software is created to allow the mining of arguments from natural language texts, this paper sets out to examine the suitability of the well-established and readily available methods of corpus linguistics to the study of argumentation. After brief introductions to corpus linguistics and the concept of meta-argument, I describe three pilot-studies into the use of the terms Straw man, Ad hominem, and Slippery slope, made using the open access News on the Web (...). The presence of each of these phrases on internet news sites was investigated and assessed for correspondence to the norms of use by argumentation theorists. All three pilot-studies revealed interesting facts about the usage of the terms by non-specialists, and led to numerous examples of the types of arguments mentioned. This suggests such corpora may be of use in two different ways: firstly, the wider project of improving public debate and educating the populace in the skills of critical thinking can only be helped by a better understanding of the current state of knowledge of the technical terms and concepts of argumentation. Secondly, theorists could obtain a more accurate picture of how arguments are used, by whom, and to what reception, allowing claims on such matters to be evidence-based rather than intuition-based.
    5 citations
  6.  15
    A Web-based Database for Drawings of Gods. Zhargalma Dandarova Robert, Grégory Dessart, Olga Serbaeva, Camelia Puzdriac, Mohammad Khodayarifard, Saeed Akbari Zardkhaneh, Saeid Zandi, Elena Petanova, Kevin L. Ladd & Pierre-Yves Brandt - 2016 - Archive for the Psychology of Religion 38 (3):345-352.
    This original web-based database was developed at the University of Lausanne as part of the international research project “Drawings of gods”, which explores children's representations of supernatural agents. Its primary purpose is to store and organize data and metadata to be easily accessible to all affiliated researchers. However, anyone interested in the matter can view the drawings, as they were made publicly available. At present, our corpus is composed of over 5'100 drawings collected in different parts of the world (...)
    No categories
  7.  6
    Multiple Corpus: a Polyangular Readings Approach? Margherita Fantoli & Marc Vandersmissen - 2018 - Corpus 19.
    In the field of textual data analysis (analyse des données textuelles, ADT), researchers are interested in the relationship between the text and the medium through which it is explored. In recent years, advances in computing have profoundly changed our relationship to text, leading to the development of new tools for study and new criteria for analysis. Within this theoretical context, the concept of polyangular reading complements the notions of linear, reticular, and matrix reading. This approach to the text has become possible thanks to corpus editing tools that are ever more (...)
    No categories
  8.  7
    Les corpus de la communication médiée par les réseaux : une introduction. Céline Poudat, Ciara R. Wigham & Loïc Liégeois - 2020 - Corpus 20.
    While the development of the web has made masses of digital data accessible, facilitating the collection of texts and the building of corpora, it has also given rise to new genres that challenge the representations, methods, and analytical frameworks developed so far. Corpora quite remote from the first traditional written corpora have thus emerged, grouped under the banner of CMR (Communication Médiée par les Réseaux / Computer-Mediated Communica...
    No categories
    1 citation
  9.  17
    The Challenges of Large‐Scale, Web‐Based Language Datasets: Word Length and Predictability Revisited. Stephan C. Meylan & Thomas L. Griffiths - 2021 - Cognitive Science 45 (6):e12983.
    Language research has come to rely heavily on large‐scale, web‐based datasets. These datasets can present significant methodological challenges, requiring researchers to make a number of decisions about how they are collected, represented, and analyzed. These decisions often concern long‐standing challenges in corpus‐based language research, including determining what counts as a word, deciding which words should be analyzed, and matching sets of words across languages. We illustrate these challenges by revisiting “Word lengths are optimized for efficient communication” (Piantadosi, Tily, & (...)
    5 citations
  10.  5
    Detection of extremist messages in web resources in the Kazakh language. Shynar Mussiraliyeva & Milana Bolatbek - 2023 - Lodz Papers in Pragmatics 19 (2):415-425.
    Currently, the Internet information and communication network has become an integral part of human life. People use social networks such as Twitter, VKontakte, Facebook, etc., to establish global contacts, exchange opinions, gain knowledge, etc. The active participation of not only individual users, but also information organizations in the entire world space makes it necessary to develop measures that correspond to modern trends in the development of information and communication technologies to ensure national security, in particular, the organization of events related (...)
    No categories
  11.  35
    Using data-mining to identify and study patterns in lexical innovation on the web. Daphné Kerremans, Jelena Prokić, Quirin Würschinger & Hans-Jörg Schmid - 2018 - Pragmatics and Cognition 25 (1):174-200.
    This paper presents the NeoCrawler – a tailor-made webcrawler, which identifies and retrieves neologisms from the Internet and systematically monitors the use of detected neologisms on the web by means of weekly searches. It enables researchers to use the web as a corpus in order to investigate the dynamics of lexical innovation on a large-scale and systematic basis. The NeoCrawler represents an innovative web-mining tool which opens up new opportunities for linguists to tackle a number of unresolved and under-researched (...)
    1 citation
  12.  25
    Finding variants for construction-based dialectometry: A corpus-based approach to regional CxGs. Jonathan Dunn - 2018 - Cognitive Linguistics 29 (2):275-311.
    This paper develops a construction-based dialectometry capable of identifying previously unknown constructions and measuring the degree to which a given construction is subject to regional variation. The central idea is to learn a grammar of constructions using construction grammar induction and then to use these constructions as features for dialectometry. This offers a method for measuring the aggregate similarity between regional CxGs without limiting in advance the set of constructions subject to variation. The learned CxG is evaluated on how well (...)
    No categories
    1 citation
  13.  6
    Exploiting Context for Biomedical Entity Recognition: From Syntax to the Web. Christopher Manning - unknown
    We describe a machine learning system for the recognition of names in biomedical texts. The system makes extensive use of local and syntactic features within the text, as well as external resources including the web and gazetteers. It achieves an F-score of 70% on the Coling 2004 NLPBA/BioNLP shared task of identifying five biomedical named entities in the GENIA corpus.
  14.  5
    Analysis and Visualization of Road Accidents Using Heatmaps Based on Web Data. Lejla Abazi Bexheti & Luan Sinanaj - 2023 - Seeu Review 18 (2):176-190.
    Road accidents have increased rapidly in recent years for a variety of reasons. Analyzing and visualizing road accidents through heatmaps can help improve policies for their prevention by informing about areas with a high-risk of road accidents. The purpose of this research is to build a model for the analysis and visualization of road accidents through heatmaps. Information about road accidents is extracted from the news of the main online media portals through scripts in the Python language and Web Scraping (...)
    No categories
  15.  5
    Lexical Profile of Newspapers Revisited: A Corpus-Based Analysis. Hung Tan Ha - 2022 - Frontiers in Psychology 13.
    The present study analyzed the vocabulary profile of the News on the Web corpus, which contained 12 billion words from online newspapers and magazines in 20 countries to determine the vocabulary knowledge needed to reasonably understand online newspaper and magazine articles. The results showed that, in general, knowledge of the most frequent 4,000 word families in the British National Corpus/Corpus of Contemporary American English wordlist plus proper nouns, marginal words, transparent compounds and acronyms was necessary to gain (...)
  16.  24
    Constitution et exploitation d'un corpus de français parlé parisien. Sonia Branca-Rosoff, Serge Fleury, Florence Lefeuvre & Matthew Pires - 2011 - Corpus 10:81-98.
    The aim of this article is twofold. First, it introduces a new digitized corpus of spoken French, freely accessible on the web. CFPP2000 (Corpus du français parlé parisien des années 2000), which currently comprises 500,000 words aligned with the audio at the level of the speaking turn, consists of a set of one- to two-hour conversational interviews about the neighborhoods of Paris, conducted in dyads or, more often, in triads. The article considers the influence (...)
    No categories
  17.  11
    Constitution et exploitation d’un corpus de français parlé parisien. Sonia Branca-Rosoff, Serge Fleury, Florence Lefeuvre & Matthew Pires - 2011 - Corpus 10:81-98.
    The aim of this article is twofold. First, it introduces a new digitized corpus of spoken French, freely accessible on the web. CFPP2000 (Corpus du français parlé parisien des années 2000), which currently comprises 500,000 words aligned with the audio at the level of the speaking turn, consists of a set of one- to two-hour conversational interviews about the neighborhoods of Paris, conducted in dyads or, more often, in triads. The article considers the influence (...)
    No categories
    1 citation
  18.  6
    La infancia de San Gerardo Mayela: de la hagiografía, al cómic y a la web. Marco Papasidero - 2023 - 'Ilu. Revista de Ciencias de Las Religiones 28:e85217.
    In this article I analyze the childhood of Saint Gerard Majella (1726-1755), an Italian lay brother of the Congregation of the Most Holy Redeemer. The aim is not so much to verify which aspects of his hagiographies are true as to examine the complex layering of traditions about the saint's childhood, in order to understand how it was formulated and how, even in more recent years, it is narrated. To this end I will examine a rich corpus of sources, produced between the eighteenth century and (...)
  19. The Battle of Samoa Revisited. Web Censoring Widens Across Southeast Asia - forthcoming - Journal of Information Ethics.
     
  20.  13
    Ontology, Semantic Web, Creativity. Semantic Web - 2011 - In Thomas Bartscherer (ed.), Switching Codes. Chicago University Press. pp. 101.
  21. From the office. Web Access Advice & Citizenship Sev Teacher - 2013 - Ethos: Social Education Victoria 21 (1):4.
    No categories
     
  22. Planning and Decision Making. Eldercare Web - 2000 - Bioethics Forum 15 (4):57.
     
  23. El docente como sujeto pedagógico en los nuevos tiempos. Juan Manuel Silva Corpus - 2014 - In David Castillo Careaga & Juana Arriaga Méndez (eds.), Formación e identidad docente: aproximaciones desde la práctica. Monterrey, Nuevo León, Mexico: Escuela de Ciencias de la Educación.
     
  24. Ouvrages envoyes a la redaction. Corpus Christianorum Continuatio Mediaevalis - 1984 - Nouvelle Revue Théologique 106:317.
    No categories
     
  25.  11
    Cultural change see extra-linguistic/cultural change decision tree analysis 211–212 see also multivariate analysis delocutive change 281–283. [REVIEW] Helsinki Corpus, N. -Gram Corpus & Oxford English Corpus - 2011 - In Kathryn Allan & Justyna A. Robinson (eds.), Current Methods in Historical Semantics. De Gruyter Mouton. pp. 343.
    No categories
  26. Hermetica the Ancient Greek and Latin Writings Which Contain Religious or Philosophic Teachings Ascribed to Hermes Trismegistus. Walter Corpus Hermeticum, A. S. Scott & Ferguson - 1924 - Clarendon Press.
    No categories
     
  27.  12
    An integrated explicit and implicit offensive language taxonomy. Barbara Lewandowska-Tomaszczyk, Anna Bączkowska, Chaya Liebeskind, Giedre Valunaite Oleskeviciene & Slavko Žitnik - 2023 - Lodz Papers in Pragmatics 19 (1):7-48.
    The current study represents an integrated model of explicit and implicit offensive language taxonomy. First, it focuses on a definitional revision and enrichment of the explicit offensive language taxonomy by reviewing the collection of available corpora and comparing tagging schemas applied there. The study relies mainly on the categories originally proposed by Zampieri et al. (2019) in terms of offensive language categorization schemata. After the explanation of semantic differences between particular concepts used in the tagging systems and the analysis of (...)
    No categories
    4 citations
  28.  11
    The Relationship Between Word Length and Average Information Content in Japanese. Yuki Tanida - 2023 - Cognitive Science 47 (6):e13302.
    Piantadosi, Tily, and Gibson analyzed a large‐scale web‐scraping corpus (the Google 1T dataset) and reported that word length is independently predicted from average information content (surprisal) calculated by a 2‐ to 4‐gram model (hereafter, longer‐span surprisal) across 11 Indo‐European languages, namely, Czech, Dutch, English, French, German, Italian, Polish, Spanish, Portuguese, Romanian, and Swedish. However, a recent article by Meylan and Griffiths suggested the importance of preprocessing for studies with large‐scale corpora and reanalyzed the same databases. After their preprocessing, the (...)
  29.  10
    Discourse on climate and energy justice: a comparative study of Do It Yourself and Bootstrapped corpora. Camille Biros, Caroline Rossi & Inesa Sahakyan - 2018 - Corpus 18.
    This article offers a descriptive and analytic view of the different stages leading to the constitution of a corpus that is representative of the issues of climate and energy justice. Overall, the corpus contains over five million words and gathers reports, newsletters and web-pages dealing with the most equitable ways of moving to a low-carbon future in the aim of limiting climate change. It can be divided into six sub-corpora, according to types of discourse communities, and methods of (...)
    No categories
    1 citation
  30. Examination dialogue: An argumentation framework for critically questioning an expert opinion. Douglas Walton - manuscript
     
    19 citations
  31. Dialogical models of explanation. Douglas Walton - manuscript
    Explanation-Aware Computing: Papers from the 2007 AAAI Workshop, Association for the Advancement of Artificial Intelligence, Technical Report WS-07-06, Menlo Park California, AAAI Press, 2007, 1-9.
     
    7 citations
  32. Common knowledge in argumentation. Douglas Walton & Fabrizio Macagno - manuscript
    Studies in Communication Sciences, 6, 2006, 3-26. [link to online version posted].
     
    5 citations
  33. The Carneades model of argument and burden of proof. Douglas Walton - manuscript
    with Thomas F. Gordon and Henry Prakken. Artificial Intelligence, forthcoming. [Preprint posted.].
     
    21 citations
  34. Tracking multiple independent targets: Evidence for a parallel tracking mechanism. Zenon Pylyshyn - manuscript
  35. Reasoning in biological discoveries. Lindley Darden - manuscript
     
    82 citations
  36.  72
    Modeling the structure of recent philosophy. Maximilian Noichl - 2019 - Synthese 198 (6):5089-5100.
    This paper presents an approach of unsupervised learning of clusters from a citation database, and applies it to a large corpus of articles in philosophy to give an account of the structure of the discipline. Following a list of journals from the PhilPapers-archive, 68,152 records were downloaded from the Reuters Web of Science-Database. Their citation data was processed using dimensionality reduction and clustering. The resulting clusters were identified, and the results are graphically represented. They suggest that the division of (...)
    8 citations
  37. Reduction: Models of cross-scientific relations and their implications for the psychology-neuroscience interface. Robert McCauley - manuscript
    Philosophers have sought to improve upon the logical empiricists’ model of scientific reduction. While opportunities for integration between the cognitive and the neural sciences have increased, most philosophers, appealing to the multiple realizability of mental states and the irreducibility of consciousness, object to psychoneural reduction. New Wave reductionists offer a continuum of comparative goodness of intertheoretic mapping for assessing reductions. Their insistence on a unified view of intertheoretic relations obscures epistemically significant cross-scientific relations and engenders dismissive conclusions about (...)
     
    17 citations
  38. The Cognitive Mechanisms Underlying the Concept of سرعة (Speed) in Arabic. Hicham Lahlou - 2023 - Awej 7 (1):21-32.
    Despite the wide range of studies on how students’ past knowledge influences their understanding of scientific terminology, few studies were conducted to compare non-scientific language with scientific language, or rather everyday language with scientific language, from a cognitive linguistic perspective. The present paper aims to determine the cognitive mechanisms, i.e., image schemas, conceptual metaphor, and conceptual metonymy, which underpin the conceptualisation of the Arabic term سرعة (speed), using a conceptual metaphor theory framework. Thus, the research question guiding this study is: (...)
    No categories
    1 citation
  39. Mental mechanisms: Philosophical perspectives on the sciences of cognition and the brain. William P. Bechtel - manuscript
    1. The Naturalistic Turn in Philosophy of Science 2. The Framework of Mechanistic Explanation: Parts, Operations, and Organization 3. Representing and Reasoning About Mechanisms 4. Mental Mechanisms: Mechanisms that Process Information 5. Discovering Mental Mechanisms 6. Summary.
     
    93 citations
  40. Phenomenology. Dan Zahavi - manuscript
    In Moran, D. (ed.): Routledge Companion to Twentieth-Century Philosophy. Routledge, 2008.
    34 citations
  41. Visualization tools, argumentation schemes and expert opinion evidence in law. Douglas Walton - manuscript
     
    3 citations
  42. Lending a hand: Social regulation of the neural response to threat. Richard J. Davidson, Coan, A. J., Schaefer & S. H. - manuscript
  43.  84
    Mental training affects distribution of limited brain resources. Lutz Antoine, H. A. Slagter, L. L. Greischar, A. D. Francis, S. Nieuwenhuis, J. M. Davis & R. J. Davidson - manuscript
    47 citations
  44. Future generations: A challenge for moral theory. Gustaf Arrhenius - manuscript
    FD-Diss., Uppsala: University Printers, 2000 (ix+225 pages).
    37 citations
  45. Natural kinds in biology. Mark Ereshefsky - manuscript
    It is commonly held that objects in the world form natural kinds. Rabbits form a natural kind and so do all pieces of gold. The traditional account of natural kinds asserts that the members of a kind share a common essence. The essence of gold, for example, is its unique atomic structure. That structure occurs in all and only pieces of gold, and it is a property that all pieces of gold must have.
     
    8 citations
  46. Identifying ethical issues of nanotechnologies. Joachim Schummer - manuscript
    in: Henk ten Have (ed.), Nanotechnology: Science, Ethics and Policy Issues, Paris (UNESCO Series in Ethics of Science and Technology), 2006 (forthcoming).
     
    9 citations
  47. Political obligation. Richard Dagger - unknown - Stanford Encyclopedia of Philosophy.
    17 citations
  48. Effects of subliminal priming of self and God on self-attribution of authorship for events. Daniel Wegner, Dijksterhuis, A., Preston, J. & H. Aarts - manuscript
  49.  40
    The Heart of an Image: Quantum Superposition and Entanglement in Visual Perception. Jonito Aerts Arguëlles - 2018 - Foundations of Science 23 (4):757-778.
    We analyse the way in which the principle that ‘the whole is greater than the sum of its parts’ manifests itself with phenomena of visual perception. For this investigation we use insights and techniques coming from quantum cognition, and more specifically we are inspired by the correspondence of this principle with the phenomenon of the conjunction effect in human cognition. We identify entities of meaning within artefacts of visual perception and rely on how such entities are modelled for corpuses of (...)
    4 citations
  50. Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora. David Hall & Christopher D. Manning - unknown
    A significant portion of the world’s text is tagged by readers on social bookmarking websites. Credit attribution is an inherent problem in these corpora because most pages have multiple tags, but the tags do not always apply with equal specificity across the whole document. Solving the credit attribution problem requires associating each word in a document with the most appropriate tags and vice versa. This paper introduces Labeled LDA, a topic model that constrains Latent Dirichlet Allocation by defining a one-to-one (...)
    No categories
    4 citations
1 — 50 / 1000