LINGUISTIC CODING OF EVIDENTIALITY IN JAPANESE SPOKEN DISCOURSE AND JAPANESE POLITENESS

by Nobuko Trent, B.A., M.A.

Dissertation Presented to the Faculty of the Graduate School of the University of Texas at Austin in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

The University of Texas at Austin
December 1997

Copyright 1997 by Trent, Nobuko
All rights reserved

TABLE OF CONTENTS

Chapter 1. Introduction
Chapter 2. Theories of linguistic evidentiality
Chapter 3. Discourse modality in Japanese
Chapter 4. Methodology
Chapter 5. Model of Japanese evidentiality
Chapter 6. Japanese linguistic politeness and evidentiality
Chapter 7. Conclusion
Bibliography

GRAMMATICAL ABBREVIATIONS

ABL       ablative case (kara)
ACC       accusative particle (o)
AD HON    addressee honorifics
AUX       auxiliary
CAUS      causative affix (sase)
CNT       contrastive (wa)
CONF      sentential particle for confirmation (ne)
COMP      sentential complementizer (no, koto, etc.)
COND      conditional affix (to, tara, eba, nara)
CONJ      conjecture (daroo, etc.)
COP       copula (da, desu)
DAT       dative particle (ni)
DES       desiderative affix (tai)
DIR       directional case (e)
EMP       emphatic
FOR       formal (= AD HON)
GER       gerund affix (te)
HON       honorific form
HYP       hypothetical
IMP       imperative
INF       infinitive (o, i, ku)
INS       instrumental particle (de)
INJ       interjection and hesitation
IRR       irrealis
LOC       locative particle (ni, de, e)
MODI      noun modifier (no)
NEG       negative morpheme
NML       nominalizer (no)
NOM       nominative particle (ga)
OBJ       object marker (o) (= ACC)
PART      sentential particle: VOC, RAPP, CONF, SHAR
PASS      passive affix
PERF      perfect affix
POSS      possessive
POT       potential affix (re, rare)
PROG      progressive affix
Q         question particle (ka)
QUOT      quotative particle (to)
RAPP      sentential particle of rapport (ne, wa)
REA       realis
REF HON   referent honorifics
RES       resultative affix (te-aru)
STAT      stative affix
TEMP      temporal particle (ni, de)
TOP       topic particle (wa)
VOC       vocative sentential particle (yo, zo, ze, sa)
VOL       volitional affix (yoo)

CHAPTER 1: INTRODUCTION

When teaching second or foreign language classes, teachers often note various phenomena of "language transfer" from a student's native language to the target language. "Transfer" may be seen in any aspect of language. For example, if the grammar of a certain language requires that medicine always be "drunk", a native speaker of that language is likely to transfer the expression to drink medicine "lexically" to his second or foreign language. Language transfer can be phonological, semantic, syntactical, or morphological, and is also seen at the discourse level, such as in discourse organization and discourse grammar (cf. T. Odlin, 1989). Presumably, a language learner also "pragmatically" transfers the "viewpoint" (i.e., the way reality is viewed) of his native language or native culture to his target language.
Seeing the same reality, people from different cultural or linguistic backgrounds might perceive it in different ways, or at least encode their perceptions in vastly different ways.1 Thus, even if perceptions do not differ, the rules of different languages (prescriptive grammar rules and/or pragmatic rules) certainly must have different emphases in expressing the same reality.

While teaching Japanese to American students, I have encountered, in addition to grammatical transfer, pragmatic transfer which may be due to the cultural differences between Japan and America, or to the differences between the pragmatic use of the Japanese language and that of American English, or most likely to an interplay of both factors. In the translated Japanese conversation (1-1) below, for example, the speaker presented an extremely low-assertive mode of speech in discussing some religious cult members at large who were suspected of being responsible for the Sarin Poison Gas case in the Tokyo metropolitan subway system in 1995, which instantaneously killed or injured hundreds of people. Rising ( . ) and falling ( . ) arrows indicate rising and falling tones in the passage:

(1-1)
F2: (1) ....that person is, what shall I say, in short, did he make (Sarin gas). Well, he made Sarin gas, and should I say he scattered it by himself. So, is he a scientist . Aren't most of them specialized in that field. So, probably, well, most probably, doing research. University research institutes do not have much funding generally, so after all, it is said that they entered [the cult group] under the condition that they can do whatever scientific research they wanted to do. You know, it is said that "religion" was a quite different thing for those people. So, it is also said that they went into the cult group only because they had desire to study more than they could have done at graduate school. So should we say they are top class scientists.
F5: (2) Is that so.
F2: (3) It is said so.
(The original Japanese transcription of this passage is in note 2.)

In the passage, although speaker F2 was talking about that which is generally believed to be true, her "level of assertiveness" is very low. Her utterances sound very unsure in English translation, but in Japanese this type of low-assertive speech is acceptable, or even preferred. The speaker used four major techniques to avoid being assertive: (1) use of structurally indirect sentences such as it is said;3 (2) use of questions and tag-questions; (3) use of lexical items with low commitment such as probably; and (4) use of hedges (e.g. you know, well, and what shall I say).

In my pilot study of "hearsay" speech in English and Japanese (Trent, 1994), Japanese speakers were observed to keep distance between themselves and the topic of their speech by consistently using structurally indirect sentences such as I heard.., I think.., and it seems.., as well as question sentences and tag-question sentences that appeared to constantly seek the agreement of the hearers.4 Overall, in comparison with an English speaker's hearsay report, Japanese speech was seen as less assertive and tended to sound more uncertain. Being low-assertive may be accepted as modest and well behaved in Japanese culture; however, it may not always be perceived favorably in intercultural communication: the overuse of less assertive speech may be considered "evasive", "irresponsible", "ambiguous", or "dubious" by the norms of other language environments.

People may well consider that the less assertive tendency of Japanese speech is simply a "cultural" phenomenon. Language and culture are said to be "interwoven", and there is a view that language structure possibly influences our thought (e.g. Sapir, 1929; Whorf, 1956).
In this study, I will assume that Japanese indirect and low-assertive speech is primarily a "linguistic" phenomenon, which can be systematically explained through a theory of pragmatics. As a native speaker of Japanese, I intuitively feel the existence of "rules" which tell us how to be appropriately less assertive and indirect in interpersonal communication if we want to be socially competent in each speech situation. As Clancy (1986) wrote, "Japanese rely upon indirection in many common social situations especially when they are trying to be polite" (p. 215); the factor that motivates the pragmatic rules here is politeness, which eventually leads us to the cultural aspect of the Japanese language. The rules for less assertiveness are not a so-called "context-independent grammar", but rather are rules for "performance" (i.e., "context-dependent interpretation" by Levinson, 1992). Hence, this dissertation is a study of Japanese pragmatics, in particular a study of less assertiveness in interpersonal communication in the Japanese language. This study investigates the relationship between language and context that is encoded in the structure of the language, and eventually the rules are examined in relation to linguistic politeness behavior in the Japanese cultural environment.

SCOPE OF THE STUDY

There are certainly numerous ways to be indirect in communication. Theories of pragmatics (speech act and politeness theories, in particular) provide us with insightful thoughts on this issue (cf. Lyons, 1983; Searle, 1975). This study specifically attempts to explore Japanese pragmatic rules which result in less assertive communication through the concept of "evidentiality", which is encoded in the language structure.5 What, then, is evidentiality?
Under his "maxim of quality" for conversational principles, i.e., "Try to make your contribution one that is true", Grice (1967, first published 1975) assumed two submaxims: (1) Do not say that which you believe to be false; and (2) Do not say that for which you lack adequate evidence (p. 46). Although conformance to these maxims is expected among rational adult speakers, one does not always have solid evidence for what one says; therefore, when a given utterance is not supported by "adequate" evidence, the speaker usually expresses low commitment to his proposition in different ways. The study of evidentiality is concerned with how this is done.

Evidentiality is generally defined as "the linguistic means of indicating how the speaker obtained the information on which he bases an assertion" (Willett, 1988:55).6 Chafe (1986) viewed evidentiality in a broader way so as to cover "any linguistic expression of attitude toward knowledge" (p. 271). If an individual has direct evidence (e.g. witnessing) on which his assertion is based, he will use direct language forms, while he may speak rather indirectly when his assertion is based on, for instance, folklore. The types of evidence that human beings have (e.g. "attested", "reported", and "inferred") must be universal; however, how differences in evidence type and in "degree of certainty" are expressed must vary across languages.

Based on these thoughts, I believe that evidentiality marking is a useful concept to apply to Japanese indirect, less assertive communication. If Japanese speakers' language behavior is more indirect than the universal standard concept of evidentiality would lead us to expect, there must be reasons behind the Japanese behavior, and this behavior may be systematic enough to form a pragmatic rule.

Evidentiality markings can be seen everywhere; English, for example, is said to be abundant in evidentials (cf. Chafe, 1986). There seem to be two ways to view evidentials.
One way is through their grammatical categories; English evidentials are expressed with modal auxiliaries (e.g. may, must, might, and can), adverbs (e.g. probably, certainly, definitely, likely, and possibly), and miscellaneous idiomatic phrases (e.g. it looks like, it sounds, and it feels like). The other way to see evidentials is through their function types such as "reliability", "induction", "deduction", "hearsay", and "sensory". I quote below some examples of the functions of English evidentials from Chafe (1986):

[1-2]
-Evidentials which indicate "DEGREES OF RELIABILITY"
  (a) We kept thinking maybe they'd be stationed at the Presidio.
-Evidentials which indicate "INDUCTION"
  (b) It must have been a kid.
-SENSORY evidentials
  (c) I see/hear her coming down the hall.
-Evidentials which express "HEARSAY"
  (d) They were using more verbs than English speaking kids have been said to learn.
-Evidentials which indicate "DEDUCTION"
  (e) He or she should take longer to respond following exposure to inconsistent information than when exposed to no information at all.
  (f) Adults presumably are capable of purely logical thought.
(264-269)

In addition to the examples above, Chafe extended the scope of evidentials and listed "hedges" and "expectation" as other types of evidentiality functions. Certainly, this list is neither exhaustive nor necessarily functionally appropriate cross-linguistically.

Although there is not yet a substantial study specifically on Japanese evidentiality,7 some thoughts on the issue have appeared in limited ways in studies of the "modality" of sentences (e.g. Nida and Masuoka, 1989). The "modality" or "mood" of sentences is another fairly unarticulated area in linguistics. Lyons's definition of modality as the "opinion or attitude of the speaker" (1977:452) seems to be widely accepted. What, however, does "opinion or attitude of the speaker" actually mean?
Fillmore (1968:23) proposed that any sentence has two main constituents: "proposition" as the basic constituent, and "modality" (negation, tense, mood, aspect, etc.). Therefore, logically, all sentences have some kind of modality, and the evidentiality factor is part of it. In this dissertation, evidentiality is primarily investigated in relation to sentential modality. Generally, Japanese sentences mark modality explicitly at least at the end of the sentence. This is due to the Japanese SOV sentence structure (i.e., Subject + Object + Verb sequence), which places the verbal element at the very end of a sentence. Of course, Japanese, like English, has other ways to express a speaker's mood, such as adverbs, deixis, and idiomatic phrases, but to cover all evidentiality phenomena would make the scope of this study too broad. Thus, the main objective of this research is to examine evidentiality in terms of sentence-final modality.

The purpose of this dissertation, then, is fairly straightforward: to examine the interpersonal communication of Japanese speakers, seeking to provide a theoretical construction of Japanese pragmatic rules, evidentiality rules in particular, which result in standard speakers' preference for less assertive and indirect forms of the language.

The next chapter briefly overviews existing linguistic theories on evidentiality in general as well as work focusing specifically on Japanese, particularly in relation to sentence modality.

Chapter three discusses the lack of assertiveness in Japanese from the perspective of evidentiality. As noted earlier, there has not yet been significant study of evidentiality in Japanese; the concept of evidentiality itself has not yet received sufficient attention. One insightful ideal construct which may have some relation to the issue was proposed by Kamio (1979, 1985, 1987, 1990, 1994) in his theory of the information territory of the conversationalists.
Kamio argues that a speaker chooses different sentence-ending modalities to indicate the "territory" to which he considers the information to belong: the topic can be in the territory of the speaker if it is, for example, about his dinner plans; it can be in the territory of the hearer if it is a question about the hearer's health; or it can be shared by both speakers' territories if it is about a mutual acquaintance. The theory sees the "distance" between the topic and the conversationalist from the viewpoint of the information territory that each speaker has. Although Kamio did not emphasize the question of evidentiality, the theory is fundamentally related to the concept of evidentiality in that both deal with how a speaker linguistically expresses the degree of psychological distance which he feels between himself and the topic.

Chapter three also explores the issue of Japanese low assertiveness from the viewpoint of discourse management. Considering Kamio's theory and the concept of evidentiality raises the possibility that the Japanese concept of evidentiality involves not only the distance between the speaker and the topic, but also the distance between the hearer and the topic. From this perspective, it follows that the Japanese evidentiality system is very hearer-sensitive. Takubo (1990, 1992) and Takubo & Kinsui (1990) argue that a speaker continually monitors the hearer's knowledge of the ongoing topic and selects appropriate linguistic forms to show this understanding. They analyzed Japanese deixis and some sentence-ending forms from this perspective. I found Takubo and Kinsui's perspective to be useful for the pragmatic conceptualization of evidential markings in that the distance between the topic and the participants is the key issue in their theory.
They used the metaphorical idea of "memory storage" in the human brain: the speaker manages the discourse by constantly referring both to information stored in his own direct memory and to information that he assumes his hearer has, which is stored in the speaker's indirect memory. As I understand it, Takubo and Kinsui's idea of "direct/indirect memory storage of the participants" is relevant to the concept of "distance between the topic and the participants".

Chapter four explains the nature of the data on which this study is based and discusses the method of analysis. Discourse data of natural speech were collected from a variety of speech situations to which approximately sixty people from diverse age groups contributed. Since the final goal of this research is to relate the Japanese system of evidentiality marking to the Japanese concept of linguistic politeness, in the analysis, which is both qualitative and quantitative, the degree of formality of the speech setting is considered to be the main variable which decides the speaker's choice of evidentiality markings. Other variables include the speaker's demographic data, the propositional content of the utterance, and the sentence-ending evidential form used for the utterance. The relationships between these variables are analyzed from the perspective of evidentiality. A custom database was developed and used in order to facilitate quantitative analysis.

Chapter five proposes a model of the Japanese evidentiality system based on the data and analysis from the preceding chapters. It is demonstrated that the Japanese system of evidentiality marking can be systematically explained by the concept of the Japanese speaker's awareness of the information territories of the participants: a speaker is aware of the socially acknowledged "owner" of a topic, being particularly sensitive to his hearer's knowledge, and linguistically expresses his awareness of the status of the information.
In doing so, a speaker may intentionally overextend his hearers' information territory so as to include the speaker's own information territory. In this way, the speaker linguistically pretends that the participants share his information. This pretense makes his speech less assertive in that the speaker continually asks for his hearers' agreement during his speech. At the same time, a speaker may also be cautious and make his information territory appear smaller than it actually is by exaggerating the distance between the topic and himself. The speaker may do so by making his speech structurally indirect. In this sense, how the distance between the topic and the communication participants is expressed from different perspectives is the core of Japanese evidentiality marking.

The speaker's emphasis on the distant relationship between himself and the topic, and on the closeness between his hearers and his topic, seems to be motivated by the speaker's desire to be polite in interpersonal communication. Indeed, "be indirect" and "show sharedness of information" are two of a variety of traditional politeness strategies. Politeness factors and rules such as "higher formality" (Fraser, 1990), "keep aloof" (e.g. Lakoff, 1973a), and "don't impose" (e.g. Fraser, 1990; Brown and Levinson, 1978) all suggest indirectness to a greater or lesser degree. Strategies such as "show camaraderie" (Lakoff) and "include both speaker and hearer in the activity" (Brown and Levinson) may be in line with the "show sharedness" strategy. In Japanese, the use of evidentiality expressions seems to be a useful linguistic strategy for being polite.

Chapter six then demonstrates how the Japanese evidentiality system is related to Japanese politeness. It is argued that observance of the system proposed in chapter five is pragmatically required in the community in the same way that situationally appropriate use of honorifics and formal forms is required.
To discuss politeness in the Japanese language inevitably involves the issue of the relationship between language and culture. There have been some studies on Japanese politeness in areas such as honorifics (e.g. Hori, 1986; Hijikata et al., 1986) and women's language (e.g. Ide and McGloin, 1991; Wetzel, 1988) that delved into the issue of Japanese culture; however, there is as yet no fully conceptualized theory of Japanese politeness as a whole. Brown and Levinson's "face wants" framework, which has probably been the most influential, views politeness in terms of sets of strategies on the part of discourse participants for mitigating potentially threatening speech acts. Their account sees language use as shaped by the intentions of individuals. In contrast with Brown and Levinson, the "social norm" view held by Japanese researchers (e.g. Hill et al., 1986) argues that politeness is a set of behavior patterns preprogrammed as a social norm by those possessing power, such as educators. The social norm view may be useful for Japanese culture in that it sees politeness as having a social function. Bourdieu (1977) claims that "concessions of politeness are always political concession...practical mastery of what are called the rules of politeness, and in particular the art of adjusting each of the available formulae...to the different classes of possible addressees, presupposing the implicit mastery, hence the recognition, of a set of opposition constituting the implicit axiomatics of a determinate political order" (p. 95, p. 218 cited by Fairclough, 1992). Referring to Bourdieu, Fairclough (1992) suggests that to investigate politeness conventions is to gain insight into social power relationships. I think Bourdieu's and Fairclough's view provides the foundation of the social norm view of politeness. However, the strategic view of politeness by Brown and Levinson should not be dismissed from the evidentiality-based viewpoint of Japanese politeness.
Conformance to evidentiality rules is almost always socially preferred, but their use can also be strategic. This topic is expanded upon in chapter six.

What, then, in Japanese culture has formed and maintained the Japanese politeness concept among the people? This question is explored in relation to the concept of "territory" in chapter seven. The insightful concept of "high context" culture versus "low context" culture, which was originated by Hall (1976) and pursued by his followers (e.g. Ting-Toomey, 1985; Cohen, 1987), seems to be useful in understanding Japanese culture as contrasted with Western cultures. Although researchers have presented a variety of distinctive differences between the two, in short, high-context cultural behavior is described as indirect, allusive, group-oriented, and shame-oriented. Japanese culture is described as being entirely high-context.8 On the other hand, Western cultures such as American culture are termed low-context and are characterized as direct, individualistic, and guilt-oriented. These differences may be seen covertly or overtly in all aspects of human life, including systems of law, trials, politics, and education. Language behavior, in particular, may present one of the most crucial distinctions between high- and low-context cultures. Thus, the Japanese evidentiality system, which makes utterances less assertive and thereby culturally acceptable, may be attributed to the high-context Japanese culture, which may be sensitive to the distinction between outsiders and insiders (i.e., group territory). Concluding this study, chapter seven discusses this cultural issue behind the Japanese evidentiality system and linguistic politeness.

CHAPTER 1: NOTES

In this dissertation, quoted conversational samples are written in the format "(x-y)", where x is the chapter number and y is a sequence number. For example, (1-2) refers to the second sample in the first chapter.
Charts, tables, and figures are written in the same fashion with the exception that they use square brackets rather than parentheses. For example, [1-3] refers to the third sample, a chart, table, or figure, in the first chapter.

1 Although a discussion of the relationship between "language" and "thought" is not the topic of this dissertation, it is certainly related to this study, since how Japanese speakers develop their concept of evidentiality must depend on a given cognitive environment, Japanese culture. The language-thought issue is often referred to in discussions of children's cognitive development; naturally, we all underwent the process of building up our cognitive system while learning to speak our native language(s). Young children rapidly acquire their native language while organizing their experiences into concepts. How do they accomplish these two critical cognitive tasks? Do linguistic patterns influence how they view reality? Whorf and Sapir, Piaget, Chomsky, and Vygotsky are the major theorists in the classic literature on language and thought. A brief explanation of their theories follows.

The Linguistic Relativity Hypothesis holds that the structure of the language one speaks affects one's perception of the world in a way that would be different if one happened to speak another language instead. The boldest presentation of this notion was by B. L. Whorf (e.g. 1956), and it is known as the Sapir-Whorf hypothesis. Whorf saw thinking as largely a matter of language, inescapably bound up with systems of linguistic expression: the structure of the language one uses influences the way in which one understands one's environment. Therefore, according to his theory, the picture of the universe differs from one language to another. This notion of language determinism has been criticized as being too strong, and scholars (e.g.
Lenneberg, 1967) criticized this notion for lack of evidence, but a weak version of the Whorfian hypothesis, which says that the lexical items and linguistic structures that a language provides can have an important influence on thought, seems to be more acceptable.

Piaget (e.g. 1968) presented his insight into the language-thought relationship in his theory of a developmental sequence of stages in human cognitive development. In the Piagetian theory of "cognitive determinism", children learn about the world first, build a cognitive structure, and then map language information onto the cognitive structure. Therefore, in Piaget's theory, language does not cause or affect a child's cognitive development.

Chomsky (e.g. 1975, 1980, 1988) proposed the concept of a "language acquisition device", an inborn human mechanism for acquiring language (syntax, in particular). His assertion is based on three assumptions. First, grammars are creative generative rules that enable a speaker to produce an infinite number of sentences which he has never heard. Second, a child's linguistic environment is too "impoverished" to provide a child with a "perfect" model of language use, in that adult speakers make errors and use incomplete sentences or indirect expressions. So it does not seem that children could deduce the structure of language from the finite and imperfect sentences which they hear. Third, despite this unfavorable environment, the process of language acquisition is fairly uniform across languages. These assumptions illustrate the miraculous nature of language development. Chomsky therefore concluded that there must be a highly abstract innate structure that constrains language acquisition (particularly syntax). Thus, the human biological aspect is emphasized more in Chomsky's theory of language development than environmental factors such as culture.

Vygotsky's "interactionist" approach assumes that higher level thought processes are derived from social interaction (e.g. Vygotsky, 1962).
Vygotsky held that language plays an important role in human cognitive development: language and cognition begin as independent processes (by the age of two), but soon prelinguistic thought interacts with language and is gradually transformed by it. Once a child establishes the connection between his experience and language, development in each will influence the other. This is why Vygotsky was particularly concerned with the field of education, particularly literacy and child development.

2 Original Japanese utterances of discourse (1-1)

(1) aa, soo. Ano hito ga ichiban nante iu no, yoosuruni tsukutta .
    Well so that person NOM most what shall I say in short made
(2) sarin o tsukutte yoosuruni jibun de maita -tte
    Sarin OBJ make(te-form) in short self INS scattered QUOT
    iu ka...
    say wonder
(3) yoosuruni kagakusha .
    in short scientist
(4) hotondo ga daigaku no toki ni sooiu bunya o
    most NOM univ. MODI time TEMP such field OBJ
    senmon to shite yatteta hito-tachi .
    major make(te) did(GER) people
(5) dakara tabun tabun-tte iu ka yoosuruni kenkyuu .
    therefore probably probably-QUOT wonder in short research
(6) daigaku no kenkyuujo-tte shikin ga amari nai
    univ. POSS research center-QUOT fund NOM much NEG
    kara kekkyoku jibun ga ima yatteru-no o
    so eventually self NOM now doing-NML OBJ
    nandemo sukinayooni tsukur-asete ageru-tte iu
    whatever as pleased make-CAUS give-COMP
    jyooken de yappari soo-iu-no ga haitta riyuu
    condition INS as expected so-COMP-NML NOM entered reasons
    ga soo-iu-no mo aru-n-janai-ka to wa
    NOM so-called-NML also exist-n-NEG-COMP QUOT CONT
    iwa-reteru kedo ne.
    say-PASS but RAPP
(7) dakara kenkyuu shitakute daigaku de wa daigakuinn
    therefore research want (te-form) univ. LOC CONT grad.school
    toka de benkyoositeru ijyoo-ni motto benkyoo shitai-tte iu
    such as INS studying more than more study want-QUOT
    ishi to iu no toka mo atte itta-n-janai ka to
    desire COMP called NML etc also exist(te-form) went-n-NEG QUOT
    mo iwa-rete-iru no ne.
    also say-PASS STAT VOC RAPP
(8) dakara moo toppu reberu no kagakusha-tte iu-ka...
    therefore EMP top level MODI scientist -COMP I wonder

3 "Indirect speech" in this research is different from "indirect illocutionary acts" (Searle, 1975). According to Searle, an illocutionary act can be produced indirectly when the syntactic form of the utterance does not match the illocutionary force of the utterance. For example, the syntactic form of the utterance could you keep quiet? is a yes/no interrogative while its illocutionary force is actually "directive" (i.e., be quiet). On the other hand, a "direct illocutionary act" is issued when the syntactic form of the utterance matches the illocutionary force of the utterance. For example, the utterance you are fired is syntactically "declarative" and its illocutionary force is "declaration". Indirect speech in this dissertation simply means structurally (syntactically and morphologically, in particular) indirect speech, which is often expressed by a complex sentence structure (in the case of English) in which the matrix verb phrase carries some modality of indirectness. The utterance it looks like he is failing the course is indirect in terms of assertiveness as well as evidentiality, as opposed to the direct statement he is failing the course. Question forms are also indirect in terms of the speaker's degree of assertiveness, and tag-question sentences are also structurally less assertive.

4 In my paper about hearsay speech (Trent, 1994), I pointed out two possible causes of the Japanese preference for indirect sentences. One is the speaker's concept of speech territory; hearsay does not belong to the speaker's information territory, so the speaker ought to express distance between the information and himself through indirect sentence forms. The other factor is simply syntactic. Japanese sentences have an SOV structure in which a verbal constituent always comes at the end of the sentence.
I assumed that with an SVO sentence structure, as in English, a speaker is not necessarily required to repeat the same verb phrase of hearsay ("I heard", for example) to tell a hearsay story; if he is telling five sentences of hearsay, the first I heard phrase may possibly cover the whole discourse. However, with an SOV sentence structure, if a speaker tries to minimize the use of the I heard phrase, he needs to put I heard at the very end of the whole discourse. This is not acceptable because, in this way, there is no way for the hearer to know that the speech is hearsay before the very end of the discourse. Therefore, SOV language speakers may tend to repeat the V (I heard) at the end of every sentence. I found that many Japanese speakers preferred to leave hearsay sentences incomplete and connect them by using the te-form of the verb at the end of each sentence. By doing so, a speaker is able to make the whole discourse sound as if it were an extremely long single sentence ("te-linkage"), and the speaker simply puts verb phrases which indicate that the information is hearsay (e.g. I heard, it seems, I think) at the very end or at the beginning of the discourse. So, this is, in a sense, a Japanese speaker's strategy to avoid the inconvenience of an SOV sentence structure when one has to repeat the same verb phrase. This is my hypothesis, and I have not investigated the behavior of speakers of other SOV languages. An example of te-linkage is shown below (an English translation of the discourse immediately follows):

(1-3)
(1) M3: Maikeru Jakuson ga jyuusansai no otokonoko o
        Michael Jackson NOM 13 years-old MODI boy OBJ
        tsurekonde
        bring in (te) (te-incomplete)
(2) nani shita-n-kana? nani shitatte, nanka seishikini wa
    what did-n-Q what do(te) somewhat officially CONT
    happyo saretenai kedo chairudo molesuteision .
    announce(PAS) (NEG) but child molestation (Noun-ending)
    .
(3)     sono otokonoko ga beddo de konna koto o
        that boy NOM bed LOC like this matter OBJ
        sareta toka itte,
        did(PASS) such say (te)                        (te-incomplete)
(4)     uttae o motteitte,
        claim OBJ bring (te)                           (te-incomplete)
(5)     moo sorosoro keijisaiban ni narookana-tte iu chokuzen
        yet shortly criminal trial become COMP just before
        de wakai ga seiritsu site
        TEMP conciliation NOM establish (te)           (te-incomplete)
(6)     de, okane, wan milion ka tuu milion ka moratte
        then money one million or two million or receive (te)   (te-incomplete)
(7)     fairu wa nakatta koto ni sita kedomo
        filing SUBJ happened(NEG) COMP made but...
(8)     demo dakara sono ko kara no uttae wa
        but because that boy from MODI charge TOP
        nakatta kedo ima keisatu gawa ga nannka
        happened(NEG) but now the police side NOM somewhat
        kenji-gawa toshite sore o saiban ni motte-iku
        prosecutors' side as that OBJ trial TEMP bring(te) go
        toka dounokouno yattoru to omou
        such such and such doing QUOT think.           (Indirect)
        . .
(9) Int.: By the way, do you know something about the relationship between Michael Jackson and Elizabeth Taylor?
(10) M3: iya nanka, naka ga ii kedo....
        well somewhat relationship NOM good but....
(11)    nannka Maikeru Jakuson ga sono saibanzata ni
        somewhat Michael Jackson NOM that trial matter
        nari hajimete tuaa o ichinichi futuka
        become start(te) tour OBJ one day two days
        yooroppa de yatte, de nokori canserusite
        Europe LOC did (te) then the rest cancel (te)  (te-incomplete)
(12)    amerika ni kaetta kana tte ittotta kedo jituwa
        America LOC returned Q COMP said but as a matter of fact
        kaette nakute
        return(te) (NEG) happen (te)                   (te-incomplete)
(13)    Erizabesu Teilaa no uti ni chotto maa otte
        Elizabeth Taylor POSS house LOC shortwhile stay (te)    (te-incomplete)
(14)    aa, jituwa koko ni ottandesuyo-tte nanka
        Oh, as a matter of fact here LOC stayed QUOT somewhat
        ni shuukan go gurai ni hyokotto kaettekita
        two weeks after about TEMP unexpectedly returned
        to iu.....
        QUOT said.                                     (indirect)

(English Translation)
(1) A: Michael Jackson brought a 13 year-old boy in, (TE-ending)
(2) What did they do? That is not officially announced so I don't know well, but child molestation... (noun ending)
    . .
(3) That boy said Michael Jackson did this and that to him in bed, (TE-ending)
(4) [The boy] sued, (TE-ending)
(5) When the case was about to reach the criminal court, conciliation was made, (TE-ending)
(6) Then, he got the money, one or two million, (TE-ending)
(7) Then, nothing was filed, (TE-ending)
(8) But, even though there was no charge from that boy, now, the police are trying to bring the case to court being the prosecution, they are doing that sort of thing or another, I think, (indirect)
    . .
(9) Int.: By the way, do you know anything about the relationship between Michael Jackson and Elizabeth Taylor?
(10) A: Well, they are on somewhat friendly terms. (direct)
(11) When the case [above] was beginning to be serious, he canceled his European tour after two or three days, (te-ending)
(12) They were saying that he returned to America but actually he did not return home, (te-ending)
(13) But stayed at Elizabeth Taylor's house for a while, (te-ending)
(14) Then after about two weeks, he came home saying he was at Taylor's, it is said like that. (indirect)

The speaker intentionally avoided completing each sentence in order to connect each one to the final indirectness marker, I think in (8) and it is said in (14). In a sense, he planned his discourse ahead to avoid saying I hear or I think at the end of each sentence. The speech sounds fairly informal due to the repeated use of incomplete sentences. I feel this is good evidence that basic Japanese syntax influences our hearsay discourse.

5 Some Japanese evidentiality expressions (e.g. expressions of sensation) appear to be grammaticalized; thus, it is difficult to say whether the proper use of these expressions is part of sentence grammar or a pragmatic requirement.
Defining "pragmatics", Katz (1977), Kempson (1975) and others agreed that grammar and pragmatics are different concepts:

Grammars are theories about the structures of sentences while pragmatic theories do nothing to explicate the structure of linguistic construction of grammatical properties and relations... They explicate the reasoning of speakers and hearers in working out the correlation in a context of a sentence token with a proposition (Katz, 1977:19 quoted by Levinson, 1983:8).

However, on the relationship between pragmatics and grammar, I agree with Levinson (1982) that pragmatics and grammar cannot be separated, since aspects of linguistic structure sometimes directly encode features of the context ("context-dependent grammar"). In Japanese grammar, the use of giving and receiving verbs is an example of context-dependent grammar. For example, there are five verbs meaning to give: ageru, kudasaru, kureru, sashiageru, and yaru. The correct use of these giving verbs requires an analysis of the semantic roles of AGENT, GOAL, and OBJECT based on "semantic scenes" (cf. Wetzel, 1984). In short, ageru (and the honorific sashiageru) is used when giving to an out-group target, kureru is used when giving to an in-group target, and yaru is used when giving to a lower-status target. This grammar requires a speaker to analyze the context of a particular act of giving.

6 The definition of "evidentiality" varies among scholars. The main reason for this is that evidentiality marking is often interwoven with other grammatical concepts such as mood and modality, particularly in terms of epistemology. Details are discussed in the next chapter.

7 Aoki (1986) is the only study known to focus specifically on Japanese evidentiality. It is a short overview of evidential-like aspects in Japanese grammar.
The study lists evidential-like expressions in three areas: descriptions of sensation, hearsay markers, and no-marking, which allows a speaker to assert a statement as a fact even if direct evidence is not available (cf. chapter two).

8 In Hall's definition, "context" is "what one pays attention to". He explained that culture functions as a selective screen on our information intake; culture designates what we pay attention to and what we ignore. In high-context cultures, awareness of the selective process is high, whereas in low-context cultures people's awareness of it is low. The process of screening is called "contexting". Hall defined cultures such as those of the American Indians, in which people are deeply involved with each other, as high-context cultures, and defined individualistic cultures--such as those of the Swiss and the Germans--in which there is relatively little involvement with people as low-context cultures. (1989: 39-40)

CHAPTER 2: THEORIES OF LINGUISTIC EVIDENTIALITY

WESTERN THEORIES OF MODALITY AND EVIDENTIALITY

The study of evidentiality has a long history that starts with the Greek and Platonic tradition and continues to this day in philosophy. It has become a linguistic issue in dealing with sentential modalities. The word 'modality' in the English language finds its root in the Latin modus (manner). Although some perspectives do not acknowledge modality as an independent grammatical category in the way that "tense" or "aspect" is acknowledged to be, the fundamental premise of this dissertation is that both modality and evidentiality are grammatical phenomena, and both categories are treated in that way. As a matter of fact, in traditional English grammar, modal auxiliaries such as may, can, must, shall and certain verbal endings have been considered a category that presents the mood of the sentence.
Earlier in this century, the logician von Wright (1951) proposed four groups of modals: alethic modes (modes of truth); epistemic modes (modes of knowing); deontic modes (modes of obligation); and existential modes. He claimed that the modal concept as a whole is concerned with the concept of "necessity and possibility". In modern linguistics, a new viewpoint regarding modals was proposed. Linguists (e.g. Fillmore, 1968; Lyons, 1977) assumed that a sentence is constructed from two basic components: a propositional element (the core part of the sentence) and a modal element (e.g. tense, aspect, and mood). As previously noted, Lyons defined modality as the "opinion or attitude of the speaker" (1977:452) toward the proposition as expressed by himself. Evidentials are defined by Chafe (1986) in the "broad sense" as marking epistemology, coding the speaker's attitude toward his knowledge of a situation, and in the "narrow sense" as marking the source of knowledge (1986:262).1 In proposing two dimensions of evidentials, Chafe suggested that evidentiality is nearly equivalent to modality. Certainly, in general opinion, evidentiality as a semantic domain is considered primarily modal. The notion of modal or modality is less clearly defined, but it is commonly agreed that evidential distinctions are a subset of "epistemic modality" marking (e.g. Lyons 1977, Bybee 1985, Palmer 1986). In epistemic modality, the notions of evidentiality, i.e., necessity and possibility, are viewed with respect to a speaker's knowledge and belief, upon which he bases his judgement of the necessity/possibility that the proposition is true. Chart [2-1] summarizes the existing views about the position of sentence evidentiality within the category of sentence modality (e.g. Lyons, 1977; Palmer, 1986; Bybee, 1985). One linguistic view regards evidentiality as a synonym of epistemic modality (e.g. Willett, 1988).
In the other view, evidentiality is narrowly defined as the part of epistemic modality which is concerned with the source of information, as shown in [2-1] (e.g. Palmer, 1986).

[2-1] Epistemic modality and evidentiality

Modality (Speaker's opinions and attitude to his proposition)
    Epistemic modality (Truth-oriented, concerned with matters of belief, knowledge, opinions, etc. A speaker qualifies his commitment to the truth of his proposition.)
        Evidentiality (Concerned with source of information, e.g. hearsay, report, senses.)
        Judgement of necessity and possibility (e.g. speaker's speculation, deduction)
    Deontic modality (Agent-oriented, concerned with the necessity or possibility of acts performed by a morally responsible agent.)
        ex. John may come. (Permission: deontic possibility)
            John must come. (Obligation: deontic necessity)

Epistemic modality was defined by Palmer (1986) as "showing the status of the speaker's understanding or knowledge; this clearly includes both his own judgement and the kind of warrant he has for what he says" (p.51). Palmer meant that there are two systems of epistemic modality: one is the speaker's judgement of necessity or possibility, and the other is evidentiality. Palmer also indicated how these two systems work differently from one language to another. He cited English as an example of a language with grammaticalized epistemic judgement, and German and others as languages that appear to combine the two in a system of grammatical marking. Although Palmer defined evidentiality as being different from epistemic judgement, in analyzing various languages he often included judgement-type epistemic modalities such as "deductive", "speculative" and "assumptive" (his terms) in the scope of evidentiality.
Indeed, whether we should separate "pure" evidentials (i.e., source of information) from epistemic judgement (i.e., statements of necessity and possibility) seems to be a persistent problem, because a speaker's judgement is based on his qualification of evidence. Chung and Timberlake (1985) propose a different framework for mood, which combines epistemic judgement and epistemic evidentiality (in Palmer's terms) in one category. In doing so, their main attention was on the contrast between a realis and an irrealis world:

Mood characterizes the actuality of an event by comparing the event world(s) to a reference world, termed the actual world. An event can simply be actual (more precisely, the event world is identical to the actual world); an event can be hypothetically possible (the event world is not identical to the actual world); the event may be imposed by the speaker on the addressee; and so on. Whereas there is basically one way for an event to be actual, there are numerous ways that an event can be less than completely actual. For this reason our discussion of mood is concerned principally with different types of non-actuality. It is also clear, however, that languages differ significantly as to which events are evaluated as actual (and expressed morphologically by the realis mode) vs. non-actual (and expressed morphologically by their irrealis mood). (1985:241)

The ways of showing realis/irrealis are certainly diverse among languages; some languages may have grammaticalized rules to mark realis and irrealis, some languages may have only pragmatic rules, and some may depend on each speaker's subjective decision. Chung and Timberlake posit three types of mode: "epistemic mode", "epistemological mode", and "deontic mode". The difference between their "epistemic mode" and "epistemological mode" is firmly within the scope of this dissertation.
They characterize each mode as follows:

The epistemic mode characterizes the event with respect to the actual world and its possible alternatives. If the event belongs to the actual world, it is actual; if it belongs to some possible alternative world (although not necessarily to the actual world) it is possible; and so on. Two subtypes of epistemic mode are often distinguished: necessity (the event belongs to all alternative worlds) and possibility (the event belongs to at least one alternative world). These subtypes are illustrated by one sense of the English modal auxiliaries; consider John must be in Phoenix by now ( = in all alternative worlds that one could imagine at this time, John is in Phoenix) and John can/may be in Phoenix now ( = there is at least one world one could imagine in which John is in Phoenix). (1985:242)

Given that the epistemic mode characterizes the actuality of an event per se, it does not include a participant target or strictly speaking, a source. The epistemic mode can be contrasted with a related mode, the epistemological mode, which differs only in that it more clearly involves a source. The epistemological mode evaluates the actuality of an event with respect to a source. The event may be asserted to be actual, or else its actuality may be dependent on the source in one of several ways. (1985:244)

Chung and Timberlake claimed to have discovered, in their survey of the essentials of tense, aspect and modal in Lakhota, Takelma, German, and others, that a speaker uses the epistemic mode and the epistemological mode differently. As quoted above, they define epistemic mode as the mode that characterizes the situation the speaker is describing with respect to both the actual world and another possible, non-actual world (i.e. the world of necessity vs. the world of possibility) and epistemological mode as the mode that is used to evaluate the actuality of the situation with respect to the speaker's source of information.
Therefore, Chung and Timberlake's "epistemological mode" theoretically involves both "evidentiality" and "judgement of necessity and possibility", which are separated in the traditional view [2-1]. Within this "epistemological mode" they proposed four parameters:

[2-2] Parameters of epistemological modes proposed by Chung and Timberlake (1985:244)
(a) "EXPERIENTIAL", in which the event is characterized as witnessed or otherwise experienced by the "source" (i.e. the speaker);2
(b) "INFERENTIAL", or "EVIDENTIAL", in which the event is characterized as inferred by the speaker from evidence;
(c) "QUOTATIVE", in which the event is reported from another source, told to the speaker by someone else; and
(d) "CONSTRUCT", the submode in which the event is a speaker's construct (thought, belief, fantasy) of the source.

Parameter (a) is direct evidence, parameter (b) is "judgement of necessity and possibility" in the traditional sense, and parameter (c) is so-called evidentiality in the traditional narrow sense. Parameter (d) is, perhaps, the speaker's "judgement", but it is more subjective than (b). These four parameters are similar to those of Chafe (1986) presented in chapter one. At a glance, the distinction between epistemic mode and epistemological mode in Chung and Timberlake's terms is not as straightforward as they claim it to be. On this point, Chung and Timberlake state that some languages may "use the same morphology to encode the epistemic and epistemological modes, suggesting that these modes are concerned with similar types of non-actuality" although "a language may express epistemologically uncertain events with morphology used basically for epistemic non-actuality" (p.245); therefore, the distinction may not be applicable to some languages. Examples of both "epistemic mode" and "epistemological mode" from Chung and Timberlake's framework may help one understand their distinction between the two modes.
The following example from Takelma is in the realis mode, with a distinct realis verb stem:

(2-3) Mena yapfa tfomo-kfwa
      bear man kill(REALIS)-3HUMAN OBJ
      (The bear killed the man)

Chung and Timberlake claim that the next Takelma sentence is in the inferential epistemological mode, with a different stem used for all non-actual moods and a special inferential suffix -kt:

(2-4) Mena yapfa tdomo-kfwa -kt
      bear man kill(IRR)-3HUMAN OBJ-INFERENTIAL
      (It seems that the bear killed the man/The bear must have, evidently has, killed the man.---Epistemological)

The irrealis mode of sentence (2-4) (i.e., a highly possible world) contrasts grammatically with the "actual world" but, at the same time, the mode of inference (obviously based on some evidence) is grammatically presented. The next example is from Lakhota (Boas and Deloria, 1941, also quoted by Chung and Timberlake), where the verb suffix tkha is analyzed as being used for a "counterfactual but hypothetically possible event"; therefore, the case represents the epistemic mode of Chung and Timberlake:

(2-5) Lehayela me-t?a tkha
      now I(SG)-die HYP
      (I could have/almost died. -Epistemic)

In Lakhota, a simple sentence without any evidential basis to support it can only use the realis mode. Willett (1988) argues that all four parameters (a) to (d) in [2-2] proposed by Chung and Timberlake are "evidential-like" and maintains that he found the same parameters in the languages he examined. Willett concludes that inference is "best treated as a third major type of evidential, on a par with sensory and reported evidence", and that these three "form a set of epistemic distinctions that contrast semantically with those of confidence (i.e., judgement)" (1988:54). Thus he defines evidentiality as "the linguistic means of indicating how the speaker obtained the information on which he bases an assertion (and reliability of a speaker's knowledge)", which I have adopted as a general definition in this dissertation.
Therefore, the scope of evidentiality in this study is approximately the same as the phenomena termed "epistemic modality" in the traditional sense, "epistemological mode" as characterized by Chung and Timberlake, and "evidentiality" by Willett. The meanings of different types of evidentials are summarized by Willett (1988) in [2-6]. Note that [2-6] still deals with two major types of information source, direct (experiential) and indirect (inexperiential) evidence, although inexperiential evidence involves more than hearsay, contrary to the popular "lay" understanding that evidentials equal hearsay. Therefore, logically, what an "evidentiality-conscious" speaker does is incorporate information about his information source (direct, reported-indirect, or inference-indirect) into the modality of his proposition.

[2-6] Meanings of grammatical evidentials by Willett (1988:96)
I. Direct evidence: the speaker claims to have perceived the situation described, but may not specify that it is sensory evidence of any kind.
    A. Visual evidence: the speaker claims to have seen the situations described.
    B. Auditory evidence: the speaker claims to have heard the situations described.
    C. Sensory evidence: the speaker claims to have physically sensed the situation described. This can be viewed as (a) in opposition to one or both of the above senses (i.e. any other sense), or (b) unspecified as to sensory mode (i.e. any sense).
II. Indirect evidence: the speaker claims not to have perceived the situation described, but may not specify whether the evidence he does have is reported to him or is the basis of an inference he has made.
    A. Reported evidence: the speaker claims to know of the situation described via verbal means, but may not specify whether it is hearsay (i.e. second-hand or third-hand), or is conveyed through folklore.
        1. Second-hand evidence: the speaker claims to have heard of the situation described from someone who was a direct witness.
        2. Third-hand evidence: the speaker claims to have heard about the situation described, but not from a direct witness.
        3. Evidence from folklore: the speaker claims that the situation described is part of established oral history.
    B. Inferring evidence: the speaker claims to know of the situation described only through inference, but may not specify whether such inference is based on observable results or solely on mental reasoning.
        1. Inference from the results: the speaker infers the situation described from his observable evidence.
        2. Inference from reasoning: the speaker infers the situation described on the basis of intuition, logic, a dream, previous experience, or some other mental construct.

However, as I wrote earlier, the simple difference between direct and indirect is not enough to explain Japanese evidentials, which seem to involve not only the speaker's knowledge but also the hearer's knowledge. In a later section I will attempt to incorporate the view from Revisionist Epistemology (Givon, 1982) that emphasizes the influence created by the hearer's knowledge of the speaker's proposition. Also, some indirect evidence (both reported and inferred) in [2-6] can be treated as direct evidence (in a sense) by a speaker in discourse, depending on how intimate the speaker feels about the proposition. This Japanese concept of direct evidence will be explained by the concept of the speaker's psychological information territory. These factors regarding the Japanese concept of evidence require a unique evidentiality system framework that is not fully explainable by what is assumed to be the universal standard concept summarized in [2-6].

EXAMPLES OF GRAMMATICIZED EVIDENTIALS

Before getting into the evidentials in the Japanese language, it would be useful to look at some examples of systems of evidentiality from various languages.
Most languages, unlike some, do not have a grammaticized system of evidentials; evidentiality expressions often reflect a speaker's subjective judgement, so that a speaker is, theoretically speaking, free to choose his own system. English, as well as Japanese, belongs to this "free" group. The following example is from the Tuyuca language (Brazil and Colombia) investigated by Barnes (1984). The sentences in [2-7] can all be translated into English as "he played soccer." Tuyuca cases have been quoted in many studies since the language shows a clear example of a grammaticalized evidential system.

[2-7] Tuyuca evidentials
(a) diiga ape-wi
    he play-evidential
    (I saw him play. -- visual)
(b) diiga ape-ti
    he play-evidential
    (I heard the game and him, but I didn't see it or him. -- senses other than visual)
(c) diiga ape-ye
    he play-evidential
    (I have seen evidence that he played: his distinctive shoe print on the playing fields. But I did not see him play. -- apparent)
(d) diiga ape-yigi
    he play-evidential
    (I obtained the information from someone else. -- hearsay)
(e) diiga ape-hiyi
    he play-evidential
    (It is reasonable to assume that he did. -- assumed)
(Palmer, 1986:67)

Palmer commented that the Tuyuca system is a case of grammaticalized "pure" evidentials (1986:67). It is reported that in Tuyuca, morphological forms of the verbal tense/person suffix function to indicate the source of information on which the speaker's proposition is based. Two types of direct evidence, (a) and (b), and three types of indirect evidence, (c), (d), and (e) in [2-7], are encoded in the grammar. Since this is a part of the grammar of the language, speakers of Tuyuca are required by the grammatical system to articulate the source of information. The next example of grammaticized evidentials is from Kogi (Chibchan, N. Colombia) studied by Hansarling (1982), and discussed also by Palmer (1986).
The grammar of this language requires its speaker to be conscious of the hearer's knowledge. If a speaker judges his proposition to be known to both parties, he has to use the particle ni ("reminding"); if he assumes that his proposition is not known to the hearer, the na particle is used to indicate that the speaker is "informing". If the speaker does not have a certain piece of information and assumes that his hearer has that information, he uses the shi particle ("asking"). If the speaker does not have a certain piece of information and assumes that his hearer does not know either, the speaker uses the modality of the skan particle (expression of "doubt"). And if he is not sure whether his hearer has information that he does not know, he is required to use the modality of the ne particle ("speculation"). A summary chart of the Kogi system is shown in [2-8] with sample sentences. In the chart, "+" indicates that the information is known, while "-" means that the information is not known.

[2-8] Kogi evidential system (Palmer, 1986: 76)

Evidential particle    Speaker    Hearer    Function of evd.
(a) ni                 +          +         remind
(b) na                 +          -         inform
(c) shi                -          +         ask
(d) skan               -          -         doubt
(e) ne                 -          ?         speculate

Sample sentences using (a) - (e) evidentials:
(a') ni-gu- ku- a. (I did it just a while ago, as you know - remind)
(b') na- gu-gu. (I tell you he did it some time ago - inform)
(c') shi- na (Is that the way it is? - ask)
(d') shag-gu (Who knows if it did just now? - doubt)
(e') nabbi no guste ne ha gna (I wonder if it is a small lion, he thought - speculate)

The evidential system in Kogi indicates "who knows what about the situation being discussed" (Hansarling, 1982:52 quoted by Palmer, 1986:76). Interestingly, this system is related to the psychological concept of the speaker's information territory, which is introduced in this dissertation to conceptualize the Japanese system of evidentiality (see later section of this chapter).
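The five-way Kogi distinction described above can be read as a lookup from the two parties' knowledge states to a particle. The following Python sketch is only an illustration of that reading, not an analysis of Kogi grammar: the names KOGI_PARTICLES and kogi_particle are hypothetical, and the encoding of "+", "-" and "?" as True, False and None is an assumption made for this example.

```python
from typing import Optional, Tuple

# Knowledge states follow the chart: True = "+" (known), False = "-" (not known),
# None = "?" (the speaker is unsure what the hearer knows).
# Keys are (speaker_knows, hearer_knows); values are (particle, function).
KOGI_PARTICLES = {
    (True, True): ("ni", "remind"),
    (True, False): ("na", "inform"),
    (False, True): ("shi", "ask"),
    (False, False): ("skan", "doubt"),
    (False, None): ("ne", "speculate"),
}


def kogi_particle(speaker_knows: bool, hearer_knows: Optional[bool]) -> Tuple[str, str]:
    """Return the (particle, function) pair for one combination of knowledge states."""
    key = (speaker_knows, hearer_knows)
    if key not in KOGI_PARTICLES:
        # The chart lists no particle for this combination (e.g. speaker "+", hearer "?").
        raise ValueError(f"no particle listed for {key}")
    return KOGI_PARTICLES[key]


if __name__ == "__main__":
    for states in KOGI_PARTICLES:
        print(states, kogi_particle(*states))
```

Writing the chart as a table makes one property visible: the particle is chosen from the speaker's assessment of both parties' knowledge, not from the source of the information.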
The most significant difference between Kogi and Japanese is that Kogi evidentials are grammaticized whereas those of Japanese are not. Palmer (1986) also suggested that Nambiquara (Brazil) is another example of a language in which epistemic modality is grammaticalized, in that various combinations of the speaker's and the hearer's knowledge are used as indicators of different epistemic modalities. Lowe (1972) analyzed the rather complex evidentiality of Nambiquara as a two-dimensional system with an "individual mode" and a "collective mode" for event verification:

speaker orientation : observation, deduction, or narration
event verification : individual or collective verification

According to Palmer, the speaker-orientation system of Nambiquara is equivalent to an evidential system: "observation" means sensory acquisition of information, "deduction" means the existence of enough evidence for the proposition, and "narration" is hearsay speech. Event verification applies to each type of "speaker orientation". Therefore there are six categories in Nambiquara's evidentiality system, as summarized in [2-9] below:

[2-9] Nambiquara evidentiality system
(a) Individual observation: "I report to you what I saw the actor (=subject) doing." (e.g. He worked.)
(b) Individual deduction: "I tell you my deduction of an action that must have occurred because of something I see or saw." (e.g. He must have worked.)
(c) Individual narration: "I was told by someone that a certain action occurred." (e.g. I was told that he worked.)
(d) Collective observation: "I report what both I and the addressee saw the actor doing." (e.g. Both you and I saw that he worked.)
(e) Collective deduction: "From what the speaker and the addressee saw, they deduce that a certain action must have taken place." (e.g. He worked, as deduced from what we saw.)
(f) Collective narration: "Both speaker and addressee were told that a certain event took place." (e.g. It was told us that he worked.)
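The six categories above are the cross product of the two dimensions attributed to Lowe (1972). The short Python sketch below only illustrates that combinatorial structure; the names ORIENTATIONS, VERIFICATIONS and nambiquara_categories are hypothetical and introduced for this example.

```python
from itertools import product

# The two dimensions as named in the text.
ORIENTATIONS = ("observation", "deduction", "narration")   # speaker orientation
VERIFICATIONS = ("individual", "collective")               # event verification


def nambiquara_categories():
    """Cross the two dimensions; each verification type combines with each orientation."""
    return [f"{verification} {orientation}"
            for verification, orientation in product(VERIFICATIONS, ORIENTATIONS)]


if __name__ == "__main__":
    for label, category in zip("abcdef", nambiquara_categories()):
        print(f"({label}) {category}")
```

The categories come out in the same order as the list labeled (a) to (f), from individual observation to collective narration.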
The system of Nambiquara is different from that of Kogi in that it does not involve information which is known only to the addressee (the hearer). In Nambiquara, the speaker is required to pay attention to whether information is known only to the speaker or known to both parties. The cases of Tuyuca, Kogi and Nambiquara support the Revisionist Epistemology theory of Givon (1982) in that the existence of the hearer is an influential factor in evidentials in these languages. As noted earlier, in the traditional idea of epistemology, the essence of the sentential mode was a matter of true or false. Therefore, traditionally, neither the speaker's subjective certainty nor the existence of hearers was considered to be important in theories of evidentiality. However, truth is rarely absolute. As Chafe claimed, "the study of evidentiality is about the human awareness that truth is relative, and particularly about the ways in which such awareness is expressed in languages" (1986: vii). In modern times, attempts have been made to show that at the bottom of propositional/sentential modalities lies an implicit contract between the speaker and the hearer. From this perspective, Givon (1982) proposed to categorize propositions into three types:

[2-10]
(a) propositions which are to be taken for granted, via the force of diverse conventions, as unchallengeable by the hearer and thus requiring no evidentiary justification by the speaker;
(b) propositions that are asserted with relative confidence and open to challenge from the hearer and thus require--or admit--evidentiary justification; and
(c) propositions that are asserted with doubt as hypotheses and thus beneath both challenge and evidentiary substantiation. They are, in terms of the implicit communicative contract, "not worth the trouble".
(1982:24, italics in the original)

As suggested above, for Givon, the knowledge level (the degree of necessity of the proposition) of the speaker and the hearer matters in deciding the necessity of evidentials. Givon rejected the concept of linguistic sentential modality which had been under the influence of the classic Platonic tradition, i.e., the traditional view of epistemology in which the essence of mode is whether the proposition is true or false by virtue of various modes of access to truth or knowledge. Givon stated:

This [Platonic] tradition has derived the bulk of its support from linguistic analysis of a distinct kind: Propositions are considered in isolation from each other as to their truth and epistemic status. Sentential modalities thus appear to be an objective matter, to which neither the speaker nor the hearer--the two participants in the communicative transaction in which human language is actually used--bear any relevance. The recent renaissance in the study of communicative pragmatics has so far made nary a dent in this tradition. The speaker's subjective certainty is not considered seriously in traditional epistemology, but rather relegated to the realm of psychology. The hearer's role in the communicative transaction is not even contemplated. (p.24, italics in the original)

Consequently, the speaker's subjective certainty is an inferential by-product of the evidentiary, experiential aspect of knowledge, while the logician's "truth" is again an inferential by-product of both evidentiary source and subjective certainty. (p.25, italics in the original)

Lyons (1977) also made a similar distinction between subjective and objective types of epistemic mode (as well as deontic mode): in his theory, the objective epistemic mode is a matter of degree of necessity, and subjective meaning is evidential by nature, but he did not elaborate on this concept.
I think that Givon made two particularly noteworthy points: first, we should realize that we are dealing with the speaker's subjective certainty in dealing with necessity and possibility of the proposition which we assume to be objectively measurable; second, the speaker certainly pays attention to the hearer in choosing the evidentials since the chosen evidentials indicate the speaker's subjective certainty that might be offensive to the hearer in some way. I believe that these theories and analyses of sentential modality are also useful in analyzing discourse modality; they provide us with a good understanding of the modal meanings of isolated words and phrases which can be utilized in the larger scope of discourse modality. Givon's view is in line with the theories of "discourse modality" (e.g. Maynard, 1993) in arguing that a theory of sentence modality does not always reflect actual language use. This point will be elaborated on in later sections.

STUDIES ON JAPANESE MODALITY

Interestingly enough, earlier this century, some Japanese linguists proposed that a sentence has propositional and modal contexts (e.g. Tokieda, 1950; Hashimoto, 1948). The idea was similar to Fillmore's later proposal (1968), although naturally the linguistic form of Japanese modality is different from that of English. English modals are easy to understand due to their close relationship with auxiliary verbs (e.g. do, have, shall, be, will, may, ought) which are morphologically independent. The functions of English auxiliary verbs are defined as expressing tense, person, number, and mood in accompanying and helping another verb. In Japanese, jo-dooshi (助動詞) are closest to English auxiliaries in their function.3 However, since the Japanese language is "agglutinative" by nature (cf.
English is "inflectional"), the Japanese jo-dooshi are not morphologically independent, but are usually attached to the main verbs or adjectives in such a way that they look like a part of the main lexical item's conjugation. Hashimoto (1948) viewed jo-dooshi as independent lexical items. He suggested two types of jo-dooshi: those attached to nouns and adjectives, and those attached to verbs.4 Hashimoto proposed the concept of bunsetsu (phrase) in that, as he argued, jo-dooshi--together with the main lexical item to which it is attached--constitute a bunsetsu (phrase), and one or more bunsetsu constitute a bun (sentence). Tokieda (1941, 1950), in conjunction with his grammatical theory "gengo katei setsu" (theory of language as a mental process), proposed to divide sentences into two parts: shi (詞) (objective/subjective notions such as book, sad, etc.) and ji (辞) (concepts outside of objectifiable expressions). For Tokieda, shi is a result of "abstraction" (e.g. the word "book" is not the same as the object "book" but a linguistic abstraction of the object "book"), while ji directly represents a speaker's position which is in the abstraction process. In the following sentence, for example, yuki ga furu (snow falls) is shi, and kamoshirenai (might, perhaps) is ji:

[2-11] Bun (sentence)
       shi                 ji
       Yuki  ga   furu     -kamoshire-nai
       snow  NOM  fall     AUX(might)
       (It might be snowing.)

Kamoshire-nai expresses the speaker's view of shi (an objective event): yuki ga furu (snow falls). Thus, Tokieda claimed that jo-dooshi is an independent part-of-speech category, and that shi and ji have different functions in that shi are "enveloped" in ji (1955:278). Tokieda also proposed to include verbal, adverbial, and nominal suffixes in the shi constituent as setsubi-go (suffixes), as shown in the following example:

[2-12] Sentence
       shi    ji
       Taroo  wa   sushi  o    tabe  -nakat    -ta        -rashii.
       Taro   TOP  sushi  ACC  eat   NEG       PAST       seem
                                     jodooshi  setsubigo  jodooshi
       (It seems that Taro did not eat sushi.)
In [2-12], the negative auxiliary, -nakat-, is a part of shi (i.e., proposition). Thus, in Tokieda's view, jo-dooshi (i.e., Japanese AUX) is not always in the ji-phrase (i.e., modal) while in English, auxiliary verbs are usually modal. Tokieda's theory influenced subsequent research on Japanese syntax. There are differences among the researchers' concept of models; however, all seem to agree that a sentence has a modal constituent to "envelop" propositional context: Tokieda's ji, Yamada's chinjutsu (1951), Mikami's muudo (1963), Teramura's muudo (1979), and Nakau's (1976) and Nitta's (1989) modality all describe the same linguistic phenomena. Thus, it seems that the dichotomy of propositional content and modal content has been adopted in Japanese. Tokieda claimed that modal content syntactically involves all the constituents from a tense-marker to the end of the sentence and functions to express a speaker's subjectivity toward his proposition. This point also seems to have been adopted by other Japanese linguists, but exactly what should be included in modality is a topic of ongoing discussion. Nakau (1976) defined modality as a speaker's psychological attitude at the time of speech. Nakau included tense, aspect, negation, question, and complementation in the domain of propositional content, and therefore meant that modal content exists outside of this domain. Masuoka (1989) claimed that modality can exist in every constituent of a sentence, meaning that modality is also in propositional content. He wrote that apart from the speaker-subjective "primary modality", a sentence has a "secondary modality" which can be objective. According to Masuoka, secondary modality includes politeness, transmission of thoughts, judgement, explanation, topicalization, and other functions in addition to traditional modal functions (e.g. tense, aspect, and negation).
It seems that the scope of Japanese modality is still unclear, at least partly because the acknowledged definition of modality, "speaker's psychological attitude", can be interpreted in various ways involving numerous linguistic and psychological phenomena of language, but at least sentence-final ji is generally acknowledged as modality.

STUDIES ON JAPANESE EVIDENTIALITY

There are only a few studies which have focused on Japanese evidentiality per se (e.g. Aoki, 1986; Watanabe, 1984), although some studies of modality superficially refer to evidentiality (e.g. Nitta and Masuoka, 1989; Nakau, 1976). In the traditional view of sentential modality with a focus on auxiliaries, there are seven major modal auxiliaries that express epistemic modality, or epistemology (i.e., evidentiality, in this study), which qualifies a speaker's commitment to the truth of the proposition. Among them, four auxiliaries express modality of epistemic judgement:5

[2-13] Japanese auxiliaries of epistemic judgement

Auxiliary        Meaning
hazu             strong logical conviction, equivalent to English must be, be expected to
ni-chigai-nai    subjective sense-based inference, equivalent to English must, without a doubt
daroo            judgement of probability, equivalent to English probably
kamo-shire-nai   judgement based on weak evidence, equivalent to English may be, might be

These auxiliaries in [2-13] are used to express "inferences" in that the proposition is based on some kind of warrant. Here, inference includes inferences from results and from reasoning (as in [2-6] in this chapter), which approximately covers the inferential functions of so-called "deduction" and "induction", and perhaps "assumption" and "speculation" in the wider scope. Each auxiliary word expresses a different degree of necessity/non-actuality as well as speaker subjectivity.
Johnson (1994:90) showed the following figure to indicate the possible relationship of necessity/possibility and speaker subjectivity:

[2-14] (Figure: the auxiliaries hazu (must), ni-chigai-nai (must), daroo (conjecture, probably), and kamo-shire-nai (might) placed along the scales of Possibility, Necessity, Hypotheticality, and Subjectivity.)

In [2-14], we see that the "necessity" of the proposition is inversely proportional to "non-actuality" (i.e., "possibility"), "hypotheticality", and "subjectivity" ("a speaker's degree of conviction" by Johnson)6. This intuitively makes sense. Hazu implies the existence of strong evidence in the speaker's mind which allows him to make a strong deduction of the necessity of the proposed event; therefore, in a sentence with hazu, the degree of hypotheticality, possibility, and subjectivity of the proposition is very low; so, at the sentential level, hazu should be used when the highest necessity is guaranteed. However, it might not be so at the discourse level. For example, it can be assumed that the implication of the speaker's strong confidence attached to hazu (must) or ni-chigai-nai (no doubt) tends to be avoided when a speaker would like to be less assertive. As a result, it is presumable that, in discourse, hazu (must) and ni-chigai-nai (no doubt) are followed by certain kinds of sentence-ending modalities to decrease the level of evidentiality. In the same epistemic modality group, four main modal auxiliaries are traditionally (in a limited sense) considered to express epistemic evidentiality as defined for this dissertation, which are often called "hearsay evidentials".
A brief explanation of hearsay evidentials is as follows:

[2-15]
soo          (1) conveys second-hand information obtained directly or indirectly through any channel, equivalent to English I heard, I read, or I was told; (2) expresses a speaker's conjecture about future or present events based on the information he obtained through sensory impression, equivalent to English it appears
yoo/mitai    (1) expresses a speaker's suppositional judgement, equivalent to English it looks like; (2) expresses counter-factual impressions
rashii       expresses a speaker's conjecture based on secondhand information, equivalent to English it seems, it looks like, or I heard

The first Japanese auxiliary of hearsay is soo. Soo is usually used with a copula as in sooda (plain), soodesu (polite). Soo(da) is used with two different meanings: "hearsay soo(da)" and "conjecture soo(da)". When preceded by tensed forms, a sentence with hearsay soo(da) conveys secondhand information obtained directly or indirectly by the speaker through any channel (e.g. hearing, reading) without any alteration by the speaker's subjectivity. As in the following example, in a hearsay soo(da) sentence, syntactically, the entire predicate before soo(da) is usually secondhand information:

(2-16) Shinbun    ni yoru to    Furorida  ni    yuki  ga   futta  sooda.
       Newspaper  according to  Florida   TEMP  snow  NOM  fell   hearsay
       (According to the newspaper, it snowed in Florida.) (Makino, 1986:499)

In sentence (2-16), the part before the auxiliary sooda, i.e., "Furorida ni yuki ga futta" (it snowed in Florida), is hearsay information. Of course, Japanese has a verb phrase, S to kiita, which literally means I heard S. So the meaning conveyed by the next sentence (2-17) does not differ at all from sentence (2-16) except that the means of information gathering (audio) is more explicitly stated in (2-17):

(2-17) Furorida  ni    yuki  ga   futta  to    kiita  yo.
       Florida   TEMP  snow  NOM  fell   QUOT  heard  VOC
       (I heard that it snowed in Florida.)
The other meaning of soo(da) is that of conjecture. Soo(da) can be an auxiliary adjective which indicates that what is expressed by the preceding sentence is the speaker's conjecture concerning an event in the future or the present state of someone or something based on the speaker's visual or other sensory impression, or intuition (Makino et al., 1986:410). "Conjecture soo(da)" occurs after the stem form of adjectives and verbs, and means appears to be. Syntactically, adding soo(da) to adjectives and verbs converts them into adjectival nouns. Observe the following example (2-18):

(2-18) Furorida  ni    yuki  ga   furi       sooda   yo.
       Florida   TEMP  snow  NOM  fall(INF)  appear  VOC
       (It appears/looks like it will snow in Florida.)

Conjecture soo(da) does not necessarily require the speaker's commitment to the proposition; thus "cancellation" of the proposition is possible:

(2-19) Furorida  ni    yuki  ga   furi       soo -datta  kedo  fura-nakatta        ne.
       Florida   TEMP  snow  NOM  fall(INF)  appeared    but   fall -(NEG)(PAST)   CONF
       (It appeared/looked like it would snow in Florida, but actually it didn't, as we know.)

Hearsay soo(da) does not involve a speaker's commitment to the truth of the proposition. For this reason, there is an opinion that hearsay soo(da) should be excluded from epistemic modality since it does not involve the speaker's supposition with regard to the necessity of the proposition (e.g. Johnson, 1994). I consider this view to be appropriate for sentence-level epistemology. But, from the pragmatic point of view, hearsay soo(da) certainly functions to present mood, since a speaker uses soo(da) when he does not want to commit himself to the necessity of the proposition, i.e., he is expressing reservation about the proposition or about the people to whom he is presenting the proposition. Further, from the evidentiality point of view, soo(da) is indispensable in representing the mood of "lack of direct evidence".
For these reasons, I have included soo(da) in the genre of Japanese epistemic modality (and it actually turned out to be a very frequent mood-indicator in Japanese discourse data).

The second so-called hearsay auxiliary is yoo, an adjectival noun which is also usually used with a sentence-ending copula, da or desu. Yoo(da) also has two major meanings: suppositional judgement and metaphor. First, "suppositional yoo(da)" expresses a speaker's suppositional judgement in cases where the speaker does not have solid evidence to argue that his proposition is true, but for some reason supposes it must be very close to the truth (e.g. Teramura, 1984). The following sentence is an example of suppositional yoo(da):

(2-20) Doomo,    sore  ga   umaku  ikana-katta   yoo-na       no   ne.
       somewhat  that  NOM  well   go(NEG)-PAST  appear-STEM  VOC  RAPP
       (It somewhat appears that it did not go well.)

Yoo(da) and mitai(da) function in the same way. They are almost interchangeable, but mitai(da) is more colloquial than yoo(da):

(2-21) Are,  yappari         dame     datta      mitai   yo.
       that  as is expected  no-good  COP(PAST)  appear  VOC
       (It appears that 'that' did not work as had been expected.)

Yoo(da) and mitai(da) are also used in counter-factual situations to indicate metaphoric observation, as in the next example, (2-22). However, metaphoric yoo(da) and mitai(da) are used when the speaker knows the truth value of his proposition, so they are not indirect evidentials.

(2-22) Sucotto-san  -tte  marude  Nihon-jin  mitai   desu  ne.
       Mr. Scott    QUOT  as if   Japanese   appear  COP   CONF
       (Mr. Scott is just like a Japanese person, although he is not.)

The fourth hearsay evidential, rashii, indicates the preceding predicate to be the speaker's conjecture based on second-hand information, such as what he has heard, read, and seen. An English equivalent to rashii is it appears, I heard or it looks like. Rashii expresses a speaker's conjecture based on some kind of reliable evidence.
In this sense, rashii functions in a very similar way to "suppositional" yoo(da) and mitai(da).

(2-23) Karuforunia  -tte  sugoku  ie     ga   takai      rashii  no   ne.
       California   QUOT  very    house  NOM  expensive  appear  VOC  RAPP
       (It appears that houses are very expensive in California.)

However, as noted, yoo(da) is often based on sensory information (visual information, in particular) while rashii is based on information the speaker obtained in any number of ways from the environment. Makino et al. (1986) suggested that if there is relatively little conjecture in the speaker's mind, rashii is almost the same as the hearsay sooda, as is the case with the above sentence (2-23) in which the information (i.e. houses are expensive in CA) is widely known.

We have seen the so-called hearsay evidentials: soo(da), yoo(da), mitai(da), and rashii. It should be noted that it is wrong to simply call this group of auxiliaries hearsay evidentials. Although all evidentials are based on information outside the speaker, with each auxiliary, the degree of the speaker's supposition involved and the emphasis on the sensory fields through which information is obtained are different. Hearsay soo (I heard) indicates that the speaker is simply conveying information that he obtained "as-is" without his manipulation; so, the speaker is not responsible for the truth value of the proposition when he uses soo(da). Therefore, a hearsay soo sentence is the least subjective. Rashii (it seems) is very similar to hearsay soo(da), but it differs from soo(da) in that it involves the speaker's supposition. Yoo(da)/mitai(da) (it looks like) also deal with information conveyance with the speaker's supposition. Yoo(da) is the auxiliary that a speaker uses in emphasizing the visual aspect of the information. The other soo(da) (i.e., "conjecture soo") also has an emphasis on visual and other sensory impressions on which the speaker bases his conjecture.
But it differs from yoo(da) in that the speaker does not commit himself to the truth of his conjecture; he simply states his conjecture from what he has seen. Teramura (1984) attempted to measure degrees of the speaker's presupposition involved in the auxiliaries on a 3-point scale. He ranked "conjecture daroo" (probably) and "conjecture soo" (appears to be) at 3 (highest involvement), yoo (appears to be) at 2, rashii (seems to) at 1, and "hearsay soo" (I heard) at zero (p.260).

These auxiliaries of evidentiality do not represent the entire epistemic modality but are only part of it; there are numerous other expressions of modality even at the sentence level across grammatical categories such as adverbs, adjectives, particles and hedges, and other specific semantic areas regardless of grammatical categories. Aoki (1986) paid attention to a specific semantic area, Japanese expressions of "sensation", the area in which evidential-like expressions are fairly grammaticalized.7 Japanese grammar requires its users to make a syntactic distinction between the description of a sensation experienced by the speaker and a sensation experienced by someone (or something) other than the speaker. When the speaker makes an inference regarding the feeling of others, it is necessary to add the verbal suffix, -garu, as in (2-27) below:

(2-25) Watashi  wa     atui.   (I am hot.)
       I        TOPIC  hot
(2-26) *Kare  wa     atui.   (He is hot.) (*ungrammatical)
       He     TOPIC  hot
(2-27) Kare  wa     atu-gatte-iru.   (He is hot.)
       He    TOPIC  hot STATIVE

Since kare (he) is the third person, sentence (2-26) is not grammatical. Sentence (2-27) with -gatte (gerundive form of garu) + iru (stative, non-past) is grammatical.
Aoki explains that -garu has the function of expressing inference (based on indirect evidence) rather than direct experience.8 He supported his point by arguing that Japanese mimetic words expressing pain (usually adverbs) such as chikuchiku (pricking), gangan (pounding), shikushiku (throbbing), and zukizuki (throbbing surface wounds) cannot be used with -garu. Pain may be perceived as something a person feels directly, so mimetic adverbs cannot be used with a third person subject as demonstrated in the following ungrammatical sentence (2-28).

(2-28) *Kare  wa     zukizuki     ita -gatteiru.
       He     TOPIC  throbbingly  pain
       (He has a throbbing pain.)

From the perspective of evidentiality, it is reasonable to assume that these expressions of sensation have been generally accepted as part of grammar due to the inherent difficulty of "knowing" other people's sensory feelings. A proposition such as he is hot is hardly attainable except in the case of literary texts in which a speaker (i.e., narrator) is supposed to be omniscient and knows all the characters' inner thoughts (cf. Banfield, 1982). Aoki also pointed out the function of the Japanese noun no (or "n", which is often called "nominalizer-no") as an evidential marker of fact. He noted that no may be used "to state that the speaker is convinced that for some reason something that is ordinarily not directly knowable is nevertheless true" (p. 228). For example, as shown earlier, the following sentence (2-29) is ungrammatical. But if the speaker adds no to the end, as in (2-30), the sentence will imply that the speaker has some evidence to assert that he is hot is a "fact". Perhaps the speaker might have witnessed the referent kare sweating or have heard kare complain about heat (evidence).

(2-29) *Kare  wa     atui.   (He is hot.)
       He     TOPIC  hot
(2-30) Kare wa atui no da. (I know that he is hot.)
He TOPIC hot NML COP

Aoki comments that semantically no "removes the (preceding) statement from the realm of a particular experience and makes it into a timeless object. The concept becomes nonspecific and detached." (p. 229) I think what Aoki meant is that the propositional part of (2-30), kare wa atsui (he is hot), is presented as a fact in the speaker's interpretation by using the no-da sentence ending. Therefore, for Aoki, no can present the speaker's subjective judgement based on some kind of strong evidence. Actually, the function of no-da (or no-desu) seems to vary even within the limited scope of evidentiality marking, and is not limited to Aoki's analysis (see chapters four and five for more discussion on this topic).

So far, examples of evidentials from auxiliaries and other areas have shown fairly "explicit" evidentiality in terms of lexical meanings. There are also discussions of the "implicit" phenomena of modality in Japanese. Akatsuka (1978, 1985) found that Japanese subjective judgment lies in subtle ways of using words such as the selection of conditionals and complementizers. Akatsuka paid attention to the aspect of the speaker's epistemology influencing sentence structure and claimed that Japanese conditionals can be arranged on a scale of irrealis (hypothetical non-actual world).9 Iwasaki (1993) proposed the concept of "information accessibility" between a speaker and his proposition. Iwasaki claims that the speaker's awareness of how accessible his proposition is to him determines the speaker's choice of linguistic modality (i.e., tense, in particular). He found that a speaker tends to use more present tense in talking about a third person's past events than in talking about his own past events (cf. historical present tense by Wolfson, 1982).
According to Iwasaki, a possible deduction is that a speaker usually has good knowledge about his own experience in the past which forces him to use past tense according to prescriptive rules: one's own information is more accessible than others'. These studies of "speaker subjectivity" (e.g. Iwasaki, Akatsuka) or "speaker epistemology" (e.g. Akatsuka) are related to the issues of evidentiality in that even at the sentence level, the grammatical structure of an utterance is partly a product of the subjective judgement of the speaker.

So far, we have briefly reviewed the studies of Japanese evidentials within the scope of sentence-level modality. The existing theoretical scope of sentence modality is very limited in that it does not involve analysis of speech events, in particular, the existence of a hearer. As the theory of discourse modality (e.g., Maynard, 1992) suggests, it is necessary to broaden the focus of evidentiality phenomena when we deal with natural use of language.

FROM SENTENTIAL MODALITY TO DISCOURSE MODALITY

Language users do not need to incorporate auxiliaries and other evidential expressions in "factual" statements which are unchallengeably true to everyone (cf. Givon, [2-10]). When this is not the case, a speaker often wants to show that he does not guarantee the truth value of his proposition one hundred percent by adding some kind of marker of epistemic modality. Therefore, theoretically, in an extreme case, if a speaker only talks about "facts" (not only in his understanding but widely known to be so), he does not need any kind of marker of epistemic modality. But is this possible? At the level of sentence grammar, it may be so; but at the discourse level, it is not. Even if a statement is known to be a simple fact, we certainly have occasions in which we feel that some additional marker of epistemic modality would be helpful. One plus one is two is a logical fact known to most of us.
However, in some kinds of speech situations, we certainly say this phrase with some marker of modality added, for example, Isn't one plus one two? (although the statement One plus one might be two is rarely used). Imagine the case in which you know that your hearer is forty years old, and the hearer knows you know that. Then, if it is necessary to remind the hearer that he is a grown-up, perhaps the statement You are forty years old can be said, but you might add some epistemic "flavor" to it depending on when, where, and to whom you are talking. Aren't you forty years old?, You must be forty years old by now, or I thought you were forty years old often sounds better than the declarative You are forty.

Even though it is not true at all to say that generative and discourse grammar are mutually exclusive (Chomsky, 1980), a discrete concept of discourse grammar must be necessary in order to deal with the issues of pragmatic language use (e.g. Teratsu 1983, Inoue 1983). As Ricoeur (1981) said, discourse has a particular speaker or writer, a particular hearer or reader, and is made at a particular time, in a particular world. These traits of discourse naturally make its acceptable features distinctively different from the ones in Saussure's concept of "langue". The discourse meaning of epistemic modals differs from their meaning at the sentence level. The hearsay marker rashii (it seems) is said to be used to indicate that the speaker has obtained the proposition from outside and made an inference based on the information, but in actual conversational discourse, there are instances in which the speaker uses rashii in describing a proposition which he has directly obtained and is thus confident of its truth value. Observe the following sentences.

(2-31) Kare  wa    kondo      kachoo        ni   naru    -rashii  yo.
       He    SUBJ  this time  section-head  DAT  become  seem     VOC
       Kinoo buchoo ni soo iwarete-ita no o kiita -nda.
Yesterday dept-head DAT so told (PASS) STAT NML OBJ heard N COP
(It seems like he is going to be the section head this time. I heard he was told so by the department-head yesterday, I tell you.)

In sentence (2-31), the speaker directly obtained the information (overheard: auditory direct experience), but his usage of -rashii (seem) is quite acceptable. An indirect statement is indeed better than a direct statement since the propositions in (2-31) are about a third person's matter. In (2-31), the speaker obtained the information by overhearing rather than through a public announcement or from the referents (buchoo or kare). These factors prevent the speaker from using a direct expression although he knows the information is true. A direct statement would sound as though the speaker is meddling with other people's affairs. In this way, the actual usage of evidential markers is not always what would be expected from the rules of sentence modality: the matter of necessity/possibility of the proposition. The variation of markings is largely ruled by pragmatic discourse modality.

Recently, Johnson took the position that sentence-level modality is "a subcategory of a larger picture of modality that is defined as a speaker's psychological attitude" (1994:46), meaning that sentence modality is only a part of the phenomenon of linguistic modality as a whole. As Maynard (1993) proposed, modality should not be limited to the sentence level but expanded to the discourse level. At the discourse level, a speaker usually has one or more hearers; therefore, knowledge about the hearer(s) will have some influence on the speaker's use of evidentials (cf. Givon, 1982 for the Revisionist view). Further, the speaker needs to be concerned with the pragmatic consequence of his statement as it affects his goal of communication, his social image, and the relationship between himself and his hearer(s).
Maynard (1993) suggested that the "modality of social interaction" cannot be wholly accommodated within the limited framework of previous studies of modality in that "discourse modality is a broader notion which includes not only the speaker's attitudes expressed by independent lexical items or combinations thereof but also those that can be understood only through discourse structures and in reference to other pragmatic means" (p. 39). Discourse modality, as referred to by Maynard, is, in short, a matter of language pragmatics in conversational or speech discourse, since Maynard focused on how some selected discrete lexical items--for example, the discourse connectives dakara (therefore) and dakedo (however), sentence-ending da (plain) and desu/masu (formal), and the interactional particles yo and ne--function in discourse. It is true that theories of sentential modality were often based on conveniently created sentences, or, if they were authentic, such data was often from a limited range of speech events. The concept of discourse modality is a broader view of modality. Certainly, various aspects of discourse pragmatics can be viewed from the perspective of modality, and this dissertation should also be considered a study of discourse modality.

CHAPTER 2: NOTES

1 Obviously Chafe did not desire to commit himself to an overly restricted view of evidentiality. As Willet observed, evidentiality marking is so often interwoven with other areas of grammar, particularly tense and aspect (also cf. Chung and Timberlake, 1985), that to "extract" the "pure" aspect of evidentiality is often difficult. One example from Takelma is quoted below from Chung and Timberlake. In Takelma, the future tense differs from other modes in that future tense cannot be negated simply by adding a negative adverb; negative future events are expressed by the inferential mood (i.e., evidentiality) plus the negative adverb as in (2-32) (b) below.
In Takelma, both "future" and "inferential" use the irrealis stem.

(2-32) (a) Yana-?tt
           go(IRR)-3SG(FUTURE)
           (He will go.)
       (b) Wede  yana-kt
           not   go(IRR)-INFERENTIAL
           (He will not go/Evidently he didn't go.)

I have observed that various aspects of modality are interwoven in English too. For example, in the sentence I could have done so and so, the modal auxiliary can is combined with "past tense" and "perfective aspect", resulting in signifying the mode of irrealis, i.e., a non-actual world with no possibility.

2 Often "source" means information source such as the speaker's direct sensory experience or somebody else's direct experience. But Chung and Timberlake used the word to mean some entity whose point of view characterizes the event as either actual or non-actual:

For primary events, the source is typically the speaker; it is the speaker who identifies the event as actual, or imposes it on the addressee, or denies responsibility for its truth, and so on. For secondary events the source is typically the subject of the matrix clause. For example, with governing verbs of intention ('want', 'try') or obligation ('order', 'forbid') the subject of the verb provides the source of modality for the subordinate clause. (232-233)

Thus, for Chung and Timberlake, the source is the speaker's subjective certainty, unless someone else's viewpoint is being transferred, in which case the syntactic subject of the sentence is the source.

3 Jo (助) means help and dooshi (動詞) means verb. Historically, there have been arguments on whether Japanese jo-dooshi are a part-of-speech or not. Ootsuka (1904) first introduced the concept of jo-dooshi into school grammar as a part of speech. Later, some linguists (e.g. Matsushita, 1930; Suzuki, 1978) argued that jo-dooshi are not a part-of-speech in that jo-dooshi simply help a verb to be conjugated and constitute a predicate. Hashimoto (1948) and Tokieda (1950) took the position that jo-dooshi are a part-of-speech.
Hashimoto proposed the concept of bunsetsu (phrase) in which jo-dooshi together with an independent lexical item (e.g. verbs, adjectives) constitutes a phrase which is treated independently as a new lexical item.

4 Japanese verbs, adjectives, adjectival-nouns and copula are conjugated to mark for tense (non-past/past) and affirmative/negative alternations for several functional forms such as command, potential, imperative, conditional, volitional, passive, causative, and causative-passive. Each conjugated form has both plain and polite (formal) forms. Inflected parts (other than the "core" part) are often jo-dooshi (auxiliary) or setsubi-go (suffix).

5 The description of the behavior of Japanese modal auxiliaries offered here is very limited. For more information, see Alfonzo (1966), Teramura (1984), Makino and Tsutsui (1986), Johnson (1994) and others.

6 According to Johnson's definition, the term "subjectivity" indicates the degree of the speaker's confidence in asserting that the proposition is true. When evidence is strong, the speaker can have a high degree of confidence (low or little subjectivity); and when a speaker lacks confidence in judging a situation, the judgment becomes highly subjective.

7 Most Japanese evidentiality expressions are not grammaticalized; however, some evidential-like aspects seem to be grammaticalized although their status is not clear (e.g. Watanabe, 1984). These days, expressions of a third party's sensations are being treated as grammar rules in many textbooks for Japanese-as-a-foreign-language classes.

8 On this point, I disagree with Aoki. I consider that -garu expressions are based on a speaker's strong belief or inference which is based on his "direct" sensory information such as being directly told about the third person's feeling. Watanabe (1984) also discussed the verbal and adverbial suffix -garu in expressing sensations of a non-speaker. Watanabe viewed the phenomenon from the perspective of transitivity.
He argued that, in Japanese, the construction of NOM-ACC has higher transitivity than that of NOM-NOM, and if a statement is based on direct evidence, the NOM-ACC construction of higher transitivity is required:

(2-33) Masao ga kaminari o kowa-gatte-iru.
Masao NOM thunder ACC fear-DIR-STAT
(Masao is showing fear of thunder.)

(2-34) *Masao ga kaminari ga kowa-gatte-iru.
Masao NOM thunder NOM fear-DIR-STAT
(Masao is showing fear of thunder.)

(2-35) Masao ga kaminari ga kowai rashii.
Masao NOM thunder NOM fear seem
(It seems that Masao is afraid of thunder.)

Since word order is fairly flexible in Japanese, particles (e.g. ga and o above) are used to assign cases. Watanabe considered -garu to be an auxiliary of direct evidence (cf. Aoki, who considered that -garu expresses a speaker's inference based on indirect evidence) which is only used for a high transitivity construction such as in (2-33); accordingly, (2-34), which combines a low transitivity construction with -garu, is ungrammatical. A low-transitivity sentence construction is only used for an indirect statement such as (2-35).

I think that Watanabe's theory of the relationship between kinds of evidence and sentence transitivity is insightful. Watanabe characterized -garu as a direct evidence marker, in direct opposition to the traditional analysis of -garu as an indirect evidence marker. In agreement with Watanabe, I consider that the so-called indirect suffix -garu is an evidential of "fact": a speaker can state other people's sensations subjectively as a fact (necessarily with some strong evidence, e.g. hearing directly from the target person). This view is deduced from many speakers' use of -garu with other indirect evidential markers (e.g. rashii, mitaida, yooda, all meaning seems), which suggests that sentence-ending -garu is rather assertive. This fact implies that, at least pragmatically, -garu is understood as a "fact" marker.
Usually, in conversation, unless the third person clearly states his feeling to the speaker, sentence (2-27) is said with an indirect marker such as -rashii, as in (2-36) below:

(2-27) Kare wa atsugatte-iru.
he TOP hot STATIVE
(He is hot.)

(2-36) Kare wa atsugatte iru rashii.
he TOP hot STAT seem
(It seems he is hot.)

I have observed that direct statements (such as 2-27) inferring other people's sensations based on the speaker's simple observation (e.g. finding that someone is sweating) are not often used in conversation. Usually, an evidential of a high degree of possibility is added (e.g. rashii, yooda, mitaida) in order to mitigate the potential offensiveness of the act of talking about someone else's feelings. Therefore, I consider that -garu is not a complete evidential at the discourse level. Certainly -garu indicates the sensation of someone other than the speaker, which suggests there is distance between the speaker and the information. But if -garu is used as a direct sentence ending, the overall sentence modality is direct, implying the speaker's confidence in the proposition. This case shows that sentence-final modality may override sentence-internal modality expressions, at least in some cases. -Garu allows a speaker to subjectively state other people's internal state of mind. This is one example of the subjective aspect of the Japanese language.

9 Each of the four main conditionals in Japanese (nara, tara, ba, and to) requires a different semantic environment for its grammatical use, but the four share the same meaning, which is equivalent to English if, when or whenever depending on the meaning of the consequent clause. Conditional expressions do not require the speaker's commitment to the proposition because they simply present possible worlds in the conditional clause. Conditionals do not represent evidential meanings; thus, they are beyond the scope of this study, but they are certainly an important part of Japanese modality.
CHAPTER 3: DISCOURSE MODALITY IN JAPANESE

In the last chapter, I demonstrated that modals--evidentials in particular--need to be investigated on the discourse level in order to understand their pragmatic use. On the discourse level, it is speculated that the existence of a hearer has a significant influence on the system of Japanese evidentiality. In this chapter, the issue of "hearer-sensitivity" of Japanese discourse, which appears in the form of modality, will be further discussed.

Discussing Japanese communication style, Clancy (1986) claims that, in Japanese culture, the main responsibility of communication lies with the listener: the listener must know what the speaker really means regardless of what the speaker literally says, however ambiguous, indirect, and reticent he may be. In contrast, she argues, in American-style communication, "the main responsibility for successful communication rests with speakers who must know how to get their ideas across" (p. 217): the speaker expresses his wishes, needs, thoughts, and feelings in adequately explicit ways in words rather than indirectly or nonverbally. This claim seems to present an overly simple dichotomy on both sides. Clancy emphasizes that the Japanese style of communication depends on the interpersonal "empathy" of a homogeneous society in which people anticipate each other's needs, wants and reactions without explicit verbal interaction. Clancy's contention makes it sound as if Japanese are "telepathic", which is, of course, not necessarily the case. However, Clancy is likely correct in her claim that Japanese mother-child interaction focuses on the development of an empathetic speech style in child cultural cognition. She suggests that the Japanese empathetic communication style is a case of the language-culture relativism view advocated by scholars such as Whorf (1956) and Scollon and Scollon (1981).
An important remark made by Clancy, which is relevant to this dissertation, is that Japanese communication is listener-oriented.

LISTENER-ORIENTED MODALITY AND SENTENCE-ENDING FORMS

Clancy said that in Japan "Communication can take place without, or even in spite of, actual verbalization. The main responsibility lies with the listener who must know what the speaker means, regardless of the words that are used." (p. 217) As suggested here, the listener may have the "responsibility" to correctly determine the meaning that the speaker intended to express. In this sense, Japanese communication style is listener-oriented because the speaker relies on the listener to understand his meaning, which may be expressed in ambiguous ways. Clancy, perhaps, only paid attention to intentional "contextual" ambiguity in Japanese speech. Another important factor of listener-orientation in Japanese communication, which Clancy did not mention, and one that I believe is ultimately more important, is the speaker's careful observation of the listener's knowledge level. How does a speaker indicate his observation of the listener's knowledge? The sentence-ending modality marker functions to do this.

It has been pointed out that the sentence-ending modality provides the strongest marker of mood in a Japanese sentence. Theoretically, a sentence can have several modals, but the mood of the last modal is usually accepted as the sentence modal. For example, as explained in the last chapter, the modal of hearsay evidentiality, yooda (appear), is the dominant modal in the following sentence as "report based on observation":

(3-1) Kare wa atsugatte iru yooda.
he TOP hot STAT seem
(It seems he is hot.)

In (3-1), kare wa atsugatte-iru (he is hot) grammatically presents the mode of realis, but the sentence-ending evidential yooda turns the mood of the whole sentence into irrealis.
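The observation that the last modal usually determines the sentence mood can be restated as a small Python sketch. It is only an illustration of the generalization above, not a parser: the function name sentence_mood and the (form, mood) pair representation are assumptions introduced here, and the mood labels are simply those used for (3-1).

```python
def sentence_mood(base_mood, modals):
    """Return the mood of a sentence under the 'last modal wins' generalization.

    base_mood: mood of the bare predicate (hypothetical label, e.g. "realis").
    modals: list of (form, mood) pairs in surface order; may be empty.
    """
    if not modals:
        # No modal follows the predicate: the direct form keeps its own mood.
        return base_mood
    # The sentence-final modal is taken to decide the mood of the whole sentence.
    _form, mood = modals[-1]
    return mood


# (3-1): kare wa atsugatte iru + yooda
mood_3_1 = sentence_mood("realis", [("yooda", "irrealis")])
```

Because the text says the last modal is "usually" accepted as the sentence modal, the sketch should be read as a default rule rather than an exceptionless one.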
Masuoka (1989) explains the general idea of mood construction in a sentence as follows:

[3-2] Bun (sentence)
Meidai: (Modality of subject)
Mitomekata no modality: (Modality of acknowledgement: affirmative/negative)
Tensu: (Modality of tense)
Shingi-handan no modality: (Modality of truth: necessity and possibility)

Masuoka points out that there exists a hierarchical relationship between modalities within a sentence. As the following diagram [3-3] indicates, the last modality of shingi-handan (necessity and possibility) holds the responsibility of deciding the final mode of the sentence. In this sense, this view is similar to that of Tokieda (1941, 1950), which was introduced in chapter two, [2-12]. In the sentence ame ga fura nakatta rashii (it seems it did not rain), rashii (it seems) presents the mode of the sentence as a whole:

[3-3] Bun (sentence)
Ame ga fura (rain NOM fall): (Subject)
nak- (NEG): (Modality of acknowledgement: affirmative/negative)
-katta (PAST): (Modality of tense)
rashii (it seems): (Modality of truth: necessity and possibility)
(It seems that it did not rain.)

Until very recently, the modality of the sentence ending had not received sufficient attention, while only the explicit lexical meanings of modal words were investigated independently on a word-by-word basis. The function of sentence-final particles such as ne, yo, no, wa, sa, ze, and zo is one of the popular issues of discourse pragmatics (e.g. Tokieda, 1951; Saji, 1956; Kitagawa, 1984). The study of the sentence-ending particles is a genuine discourse issue because sentence-level grammar does not require them, and accordingly, functions of sentence-final particles were not emphasized in Japanese-as-a-foreign-language classrooms until recently.
According to Maynard's historical review (1992), traditionally, sentence-ending particles (shuu-joshi in Japanese), which only appear in speech with distinctive addressees, have been considered to be somewhat "interactional" since at least Tokieda (1951). Tokieda claimed that the particle ne is used to request the hearer's sympathy and that zo and yo function to force the speaker's view onto the hearer. Uyeno (1971) classified these particles into two categories: (1) those which express the speaker's insistence on forcing the proposition on the hearer (yo, wa, zo, ze, sa); and (2) those which express a request for compliance with the proposition but leave the option of confirmation to the hearer (ne, nee, na, naa). Kitagawa (1984) and Watanabe (1968) considered ne to indicate that the proposition of the sentence is related to the addressee. McGloin (1990) distinguished three types of functions of sentence-final particles: (1) zo, ze, sa, and yo function to "impart information which belongs to the speaker's sphere to an addressee"; (2) ne and na are "used to seek confirmation from the hearer"; and (3) ne, na, wa, and no function to create "rapport" (p. 36). All these researchers share an almost identical perspective on these particles.

In summarizing the existing views and focusing on interpersonal aspects of the particles, Maynard (1992) called sentence-final yo and ne "interactional particles" and indicated that they were also "discourse modality indicators" focusing on different aspects of discourse modality: yo focuses on the informational aspect of the proposition, and ne focuses on the interpersonal aspect in soliciting confirmation and emotional support. Since these sentence-final particles may involve the speaker's judgment of his hearer's knowledge (ne, na) and/or judgment of the necessity and possibility of the proposition (yo, sa, etc.), it is predictable that these sentence-final particles share important roles in Japanese evidentiality (cf.
chapters four and five for details). In particular, the pragmatic function of the particle ne has been drawing attention since Kamio (1979) as an important modal in discourse.

SENTENCE-ENDING FORMS AND THE SPEAKER'S TERRITORY OF INFORMATION

When there was no general concept of the sentence-ending mood, the Japanese psychologist Akio Kamio (1979, 1985, 1987, 1990, 1994) proposed an insightful theory that a speaker, using sentence-final forms, linguistically marks the information territory to which his proposition belongs. Kamio applied the theory to English and Japanese and discussed the differences between the two languages in the speaker's concept of information territory. Regardless of the practicality of his view in modeling reality, Kamio's model offered a new perspective to the field of discourse pragmatics.

As has been recognized, other than sentence-level grammar, there is a wide range of uses of language that a person may need to have knowledge of and skill in performing to be considered a competent speaker of that language (e.g. Hymes, 1979; Halliday, 1979). Through teaching Japanese, I have often felt that the appropriate usage of the sentence-final modality markings is one of the biggest issues for learners in becoming competent speakers of the language. Although this aspect of Japanese is not part of the language's grammar, it is an important pragmatic requirement of discourse (i.e., discourse grammar) which even native-speaker language teachers would have a difficult time describing systematically. In Japanese, researchers have put some thought into the concept of discourse grammar. For example, Kuno (1978) attempted to formulate the rules of ellipsis and syntactical phenomena in discourse, and Inoue (1983) discussed the Japanese particles wa and ga as markers of new/old information in a given discourse. The theory of territory of information by Kamio was, however, the first to discuss sentence-ending modalities as pragmatic rules of spoken discourse.
Kamio's framework can be interpreted in such a way that most Japanese sentences in discourse must have the "right" kind of modality in the sentence ending if they are concerned with the hearer. The theory's major concern is the relation between the sentence-ending forms and the speaker's psychological concept of territory. Since epistemic markers usually reside in sentence modality (e.g. Palmer 1986; Willet 1988), and sentence modality is often found in the sentence ending in Japanese (e.g. Nitta & Masuoka, 1989), I consider Kamio's information territory theory to be also a theory of epistemic evidentiality.

Kamio paid attention to the sentence-ending forms at the discourse level instead of at the sentence level. For example, the following Japanese sentences in direct forms are perfectly grammatical at the sentence level, but may sound inappropriate at the discourse level. The following sentences are all in direct ending forms:

(3-4) O.J. wa muzai ni nat-ta.
O.J. TOP innocent to become-PAST
(O.J. was found innocent by the jury.)

(3-5) watashi wa anata ga suki desu.
I TOP you ACC like COP(FOR)
(I love you.)

(3-6) Kyoo wa ii tenki desu.
Today TOP nice weather COP(FOR)
(It's a fine day.)

The Japanese sentences above, which we teach in Japanese-as-a-foreign-language classes, are grammatical as they are. However, when used in actual communication, each of them sounds fairly "declarative" because it disregards the hearer's existing knowledge about the proposition. Or these sentences may sound "careless" about the hearer's possible disagreement with the proposition. Therefore, these utterances are often considered to be too assertive at the discourse level in many speech situations. Sometimes an assertive declarative sentence works well to serve the speaker's purpose; for example, sentence (3-5) is often used by a speaker who wants to confess his "one-sided" love to his target who is not aware of the speaker's secret feeling.
The following utterances (3-4'), (3-5'), and (3-6') encode consciousness of, or attempt to involve, the hearer's knowledge about the proposition by attaching modality markers at the end of the sentence:

(3-4') O.J. wa muzai ni natta sooda ne.
O.J. TOP innocent became heard PART(RAPP)
(I heard that O.J. was found innocent by the jury.)

(3-5') watashi wa anata ga sukina n-desu.
I TOP you ACC like-n COP(FOR)
(I love you, please understand/as you might know.)

(3-6') Kyoo wa ii tenki desu ne.
Today TOP nice weather COP(FOR) PART(SHAR)
(It's a fine day, as we both know.)

In sentence (3-4'), the auxiliary sooda indicates that the information is second-hand. Sooda (I heard) shows the speaker's consciousness of distance between himself and his proposition. The English translations for (3-5') and (3-6') are almost the same as the corresponding ones for (3-5) and (3-6), while the pragmatic meanings in Japanese are different. The nominalizer -n (or no) in (3-5') is said to mark the speaker's intention to explain, to persuade, to convince, or to give background information or new information as if it is already known to the hearer (e.g. McGloin, 1980: 144), and ne in (3-6') as well as (3-4') is said to indicate the speaker's awareness that the information is shared with the hearer (e.g. McGloin, 1990; Maynard, 1993; Kamio, 1979-1994; Takubo & Kinsui, 1990-1992). Therefore, in sentences (3-4') to (3-6'), modification to reduce assertiveness is made through the sentence-ending forms.

In relation to sentence-final modalities, Kamio proposed that there are two fundamental conceptual information territories: the speaker's and the hearer's territories of information. His early theory had only four types of information: the factors of "inside/outside of the speaker's territory" and "inside/outside of the hearer's territory" make a two-by-two matrix resulting in four different types.
Each information category was assigned a single surface sentence-ending form:

[3-7] Kamio's original concept of four information territories for a speaker

Inside the speaker's territory, inside the hearer's territory: TERRITORY A (information belongs to both speaker's and hearer's territories); direct+ne form
Inside the speaker's territory, outside the hearer's territory: TERRITORY B (information belongs only to the speaker's territory); direct form
Outside the speaker's territory, inside the hearer's territory: TERRITORY C (information belongs only to the hearer's territory); indirect+ne form
Outside the speaker's territory, outside the hearer's territory: TERRITORY D (information is out of both speaker's and hearer's territories); indirect form

This earlier framework of Kamio is relevant to a well-known psychological concept, the "Johari Window", developed by psychologists Joe Luft and Harry Ingham (e.g. Goffman, 1968). The Johari Window is "a flat-pack, conceptual model for describing, evaluating, and predicting aspects of interpersonal communication" (Jarvis, 1996). This idea describes four different ways in which you are seen by others and see yourself, which demonstrates patterns of how people communicate with the outside world. This psychological view of human communication style assumes four different windows of the human mind which are classified by two sets of contrastive factors: "self" vs. "others", and "known" vs. "unknown":

[3-8] Johari Window

WINDOW PANE   SELF      OTHERS    DESCRIPTION of KNOWLEDGE
#1            known     known     public
#2            known     unknown   hidden from others
#3            unknown   known     blind to self
#4            unknown   unknown   unconscious

                OTHERS known   OTHERS unknown
SELF known      #1             #2
SELF unknown    #3             #4

This concept suggests that an individual views himself as well as others through one of these panes in each social interaction. Although
Although 81 the concept deals with the self-image of an individual, the foundation of the Johari Window is equivalent to Kamio's concept in that information (for Kamio) or self-image (for the Johari concept) is viewed with in relation with how it is known or perceived by himself and other people, thus the concept of information territory is also a psychological issue. Later Kamio (1994) revised the theory and argued that information has a relative and gradable character, so sometimes it falls completely in the territories of both sides and sometimes it falls more in one side than in the other. Based on this idea, Kamio assumed six different "cases" of interaction of the speaker's and hearer's information territories, in which most of our daily utterances fall. Kamio said that the sentence-ending form of each utterance reflects the types of interaction of the information territories to which utterances belong as shown in [3-9]: [3-9] Cases of interaction of the speaker's and Sentence-ending forms the hearer's information territories in Japanese discourse (A) The speaker's territory only (e.g. I have a headache.)------------direct form (B) Both Speaker's and Hearer's territories (and information is completely shared) (e.g. It's a beautiful day.) ---------------direct+ne form (BC) Both Speaker's and Hearer's territories (but the speaker considers the information to fall more within his own territory than in the hearer's territory.) (e.g. My sister is pretty, isn't she?) -------------daroo(deshoo) form 82 (CB) Both Speaker's and Hearer's territories (but the speaker considers the information to fall more deeply within the hearer's territory than in the speaker's territory.) (e.g. You are Mr. Yamada, aren't you?)--------------daroo(deshoo) form janai form (C) The hearer's territory only (e.g. It looks like you are feeling sick, aren't you?)-------indirect+ne form (D) Neither the speaker's nor the hearer's territory (e.g. 
It seems that it will be fine tomorrow.)-----------------indirect form Japanese sentences which correspond to the above English sentences are shown below: [3-10] (A) watashi, atama ga itai. I head NOM aches. (direct) (I have a headache.) (B) ii tenki desu ne. fine weather COP(FOR) PART(CONF) (It's a nice weather as we both know.) (BC) Uchi no imooto, kirei daro. my POSS younger sister pretty AUX(tag-question) (My sister is pretty, isn't she?) (CB) Yamada-san deshoo? Mr. Yamada AUX(confirmation) (You are Mr. Yamada, am I right?) (C) kibun ga warui mitai desu ne. feeling NOM bad appear COP(FOR) PART(CONF) (You seem to be feeling sick, aren't you?.) (D) Ashita wa hareru daroo. tomorrow CNT get fair AUX (conjecture) (It will probably be fine tomorrow.) (The sentences are selected from Kamio, 1994: 87-98 and presented with minor modifications) 83 Kamio argued that a speaker unconsciously uses the distinctive sentence-ending forms described above depending upon the "case" type to which his proposition belongs. Perhaps, however, the framework represented in [3-9] and [3-10] is too simplistic: a speaker's awareness of each of the six cases of territory interaction is simply connected with the use of a single surface linguistic form to represent each case of territory interaction. Actually, I have found in the data additional linguistic forms used related specifically to each case; therefore, further analysis on the use of all the possible sentence-ending forms, and on how those forms can be integrated into the whole system of information territory is necessary to complete Kamio's framework. Kamio's theory explains the phenomenon that the usage of direct sentence forms in Japanese is pragmatically limited at the discourse level. According to Kamio's model, only information which belongs to (A) type information of [3-9] and [3-10] (the speaker's territory only) is legitimately expressed in direct forms. 
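The pairing of cases and forms in [3-9] amounts to a lookup table, which the following Python sketch restates for illustration. The names Case, ENDING_FORMS, ending_forms, and allows_bare_direct_form are hypothetical labels introduced here, not Kamio's notation; the form strings are copied from [3-9], and the sketch deliberately reproduces the one-form-per-case simplicity criticized above rather than the fuller inventory found in the data.

```python
from enum import Enum


class Case(Enum):
    """The six cases of territory interaction listed in [3-9]."""
    A = "speaker's territory only"
    B = "both territories, information completely shared"
    BC = "both territories, more within the speaker's"
    CB = "both territories, more within the hearer's"
    C = "hearer's territory only"
    D = "neither territory"


# Sentence-ending forms paired with each case in [3-9].
ENDING_FORMS = {
    Case.A: ("direct",),
    Case.B: ("direct+ne",),
    Case.BC: ("daroo (deshoo)",),
    Case.CB: ("daroo (deshoo)", "janai"),
    Case.C: ("indirect+ne",),
    Case.D: ("indirect",),
}


def ending_forms(case):
    """Return the sentence-ending form(s) that [3-9] assigns to a case."""
    return ENDING_FORMS[case]


def allows_bare_direct_form(case):
    """True only where the plain direct form (without ne) is listed."""
    return "direct" in ENDING_FORMS[case]


# Cases in which a bare direct ending is licensed under this table.
direct_cases = [c for c in Case if allows_bare_direct_form(c)]
```

Under this table, direct_cases contains only case (A), which matches the claim that the bare direct form is legitimate only for information in the speaker's territory alone.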
Kamio identified three groups of information resources which are relevant to the notion of the speaker's territory of information, as described in [3-11] below:

[3-11]
(a) information obtained through the speaker's direct experience;
(b) information about persons, facts, and things close to the speaker, including information about the speaker's plans, actions, and behavior, and places to which the speaker has a geographical relation; and
(c) information embodying detailed knowledge which falls within the speaker's professional or other expertise.2

The theory suggests that a speaker is considered to have "socially-licensed" privileged access to information which belongs to classes (a), (b), or (c) of [3-11] (at least in Japanese communities). Factor (b) (i.e., a speaker is entitled to consider information about persons, facts, and things "close" to the speaker as his own territory information) presents an outstanding aspect of the Japanese sociolinguistic norm. For a Japanese speaker, information about other people in his uchi (inside) group (e.g. family matters) is in his own territory, although Kamio did not emphasize the sociolinguistic meanings of factor (b). Factor (a) corresponds to universally acknowledged direct evidentials. Factor (c) is also understandable.

The pragmatic restriction placed by (a), (b), and (c) on the direct sentence ending results in a large proportion of Japanese sentences being produced with indirect modality, which belongs to territorial interaction types (B), (BC), (CB), and (D) of [3-9] and [3-10], i.e., the indirect form territories. For type (B) information, the speaker needs to use the direct form plus the particle ne. It is another direct territory of the speaker, but since the information is shared by the hearer, the particle of information sharing, ne, should be added.
For (BC) type propositions, since the speaker is asking for compliance of the hearer with a proposition of his own information territory (which is also shared by the hearer to some extent), the auxiliary of compliance-getting, daroo (deshoo in polite form), should be used.3 Type (CB) propositions should end with the auxiliary daroo or the negative question form janai since the proposition falls more into the hearer's territory, and the speaker is asking for agreement to what he believes is shared with the hearer. Only the hearer is supposed to have access to (C) type propositions, so the speaker must make sure to express that the information is outside his territory by using an indirect form to utter the proposition. The particle ne is also obligatory with (C) type information, as in (B), since the proposition falls deeply within the hearer's territory and the speaker asks for the hearer's assent. (D) type information does not fall in either the speaker's or the hearer's information domain, and therefore should be expressed exclusively in the indirect form. In this case, "optional ne" can be added. Optional ne is different from the "obligatory ne" of cases (B) and (C) in that optional ne functions to send "rapport" (e.g. McGloin, 1990) while obligatory ne asks for the hearer's assent or compliance (see chapter four for the analysis of ne).

Kamio's interest was in the functional analysis of the Japanese language, so he did not explicitly emphasize the sociolinguistic and pragmatic aspects obviously involved in his model. I believe that Kamio's model may contribute to the sociolinguistic and pragmatic analysis of the Japanese language in the following three major aspects:

[3-12]
(a) The theory presented the domain of sociolinguistic territory of the Japanese concept of "close" information to the speaker, and accordingly provided a reason why indirect mood is dominant in Japanese spoken discourse.
(b) It suggested that the use of Japanese evidentiality of given information is "relative" to the hearer's knowledge within a given discourse.
(c) In accordance with (b), the theory characterized the pragmatic function of the final forms, e.g., particle ne, auxiliary daroo (deshoo), direct, and indirect forms, as the sentence-final mood indicators. This concept is remarkable in contrast with the traditional approach from sentence grammar.

Although Kamio's theory deals with the human psychological territory of information, which potentially involves some sociolinguistic aspects, attention was not paid to contextual variables of discourse which are possibly influential on the model of information territories. Therefore, discourse variables such as the nature of participants and speech settings were extensively emphasized in this study in order to locate the sociolinguistic aspects of the Japanese evidentiality system.

Japanese evidentials are not only based on the ways that information is obtained, as universal rules of evidentiality define (i.e. direct evidence/experience vs. indirect evidence/experience). It seems that the Japanese system is also based on the speaker's awareness of his hearer's knowledge. As noted earlier in chapter two, languages such as Kogi and Nambiquar share the same kind of hearer-conscious concept of evidentiality with the Japanese language (and the systems of Kogi and Nambiquar are grammaticalized). So the phenomenon of "psychological territory for information" is not unique to Japanese. In fact, the phenomenon is not limited to a small number of languages: we find similar concepts in English too. Labov and Fanshel (1977) analyzed "therapeutic interviews" between mental patients and their psychotherapists. In doing so, they categorized the initiation from the psychotherapist into five event categories, which are A-, B-, AB-, O-, and D-events.
This classification of statements according to the shared knowledge involved was done for the purpose of anticipating the "syntagmatic" structure of responses from the patients; the authors' interest was therefore in the characteristics of responses to each event category, which is irrelevant to this study. However, the authors' method of categorizing therapeutic speech from the viewpoint of information territory is useful. Their categorization of the therapist's speech events follows in [3-13]:

[3-13]
A-event: events to which the speaker (A) has privileged access.
B-event: events about which the hearer (B) has privileged knowledge.
AB-event: knowledge which is shared by A and B.
O-event: events which are known to everyone present and known to be known.
D-event: events which are known to be disputable.

The authors said that "these classifications refer to social facts--that is, generally agreed upon categorizations shared by all those present" (p. 100). Stubbs (1983) evaluated their study and explained the concept of event-classification as follows:

A-events are events to which the speaker has privileged access, and about which he cannot reasonably be contradicted, since they typically concern A's own emotions, experience, personal biography, and so on. Examples include I'm cold and I don't know. Notice how, in school classrooms, a statement such as I don't know may be the only one to which a pupil is not open to correction. B-events are, similarly, events about which the hearer has privileged knowledge. A cannot therefore normally make unmitigated statements about B-events, such as you're cold, unless A is in authority over B, for example, as mother to child. Statements about B-events would normally be modalized or modified: You must be cold or You look cold. (118-119)

Labov also uses three other related terms.
AB-events are defined as knowledge which is shared by A and B, and known by both to be shared. ... O-events are known to everyone present, and known to be known. D-events are known to be disputable. There is therefore a classification of utterances according to the amount of shared knowledge involved. These definitions of AB- and O-events are comparable to the way in which the term pragmatic presuppositions is often defined, as propositions which are established by the preceding discourse, or which can be assumed to be generally agreed. (119)

As to A-events and B-events, Labov's and Kamio's views are almost identical in that "A-events are those that typically concern A's emotions, his daily experience in other contexts, elements in his past biography, and so on" (1977:100). Accordingly, Labov and Fanshel stipulated the "Rule of Confirmation," which a response must respect in order to be coherent within the discourse: "if A makes a statement about B-events, then it is heard as a request for confirmation."

Responses to assertions are heavily determined by the relation of the proposition being asserted to knowledge shared by the participants. If A asserts an A-event, he normally requires only an acknowledgement of a minimal kind: he often uses such assertions to introduce a narrative; B simply must show that he is prepared to pay attention during an extended turn at talk. In the special case that A makes an assertion about a B-event, his utterance is heard as a request for confirmation. Assertions about AB- or O-events come closest to the concept of remarks: utterances that make minimal demands for response. (101)

Therefore, Labov and Fanshel paid attention to the hearer's responsibility, in English communication, to understand the event category to which the speaker's proposition belongs (through both context and structure, perhaps) and to reply correctly as expected.
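The relation between event category and expected response in the passage quoted above can be summarized as a small lookup, sketched here in Python for illustration only. The names EVENT_TYPES, RESPONSE_DEMAND, and expected_response are assumptions introduced for this sketch, and the response labels are paraphrases of the quoted wording rather than Labov and Fanshel's own terms. D-events are left unmapped because the quoted passage does not state a response for them.

```python
# Event categories from [3-13].
EVENT_TYPES = ("A", "B", "AB", "O", "D")

# What an assertion about each event type demands of the hearer,
# paraphrasing the passage quoted from Labov and Fanshel (1977: 101).
RESPONSE_DEMAND = {
    "A": "minimal acknowledgement",
    "B": "confirmation",  # the Rule of Confirmation
    "AB": "minimal response (remark)",
    "O": "minimal response (remark)",
}


def expected_response(event_type):
    """Return the response demanded by an assertion about an event type.

    Returns None for an event type that the quoted passage leaves open.
    """
    if event_type not in EVENT_TYPES:
        raise ValueError(f"unknown event type: {event_type!r}")
    return RESPONSE_DEMAND.get(event_type)


# An assertion by A about a B-event, e.g. "You're cold."
demand_for_b_event = expected_response("B")
```

In this sketch the burden of classification falls on the hearer, who must decide which key applies before responding, which is the point of contrast with speaker-marked sentence endings in Japanese.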
In my observation, in Japanese communication, the speaker is responsible for indicating the category of the proposition properly through sentence-ending forms, and a reasonably polite hearer respects a reasonably polite speaker's decision on sentence-ending forms. If the speaker uses a direct evidential for a given piece of information, the listener accepts that the proposition belongs to the speaker's territory and will use indirect forms to talk about it himself; thus, if the hearer does not agree, he may need to show, when he talks, where he considers the propositional information to belong.

Labov and Fanshel acknowledged O-events and D-events as two distinct categories. They said that "the clearest interactional consequences follow when A makes an assertion about a D-event...If A makes an assertion about a D-event, it is heard as a request for B to give an evaluation of that assertion" (the "Rule of Disputable Assertions" of discourse coherence; p. 101). In their view, it seems, whether the event is thought to be known or disputable makes a difference in English speakers' acceptance of what is heard.

We can raise some issues with their analysis. First, the border between O-events and D-events can be very fuzzy. On this point, the authors claimed that one's "pragmatic presupposition" decides whether a certain event is O, AB, or D. A speaker's subjective decision is assumed to be involved in this process. I find this exercise of subjectivity to be a very interesting issue. In a given culture, how much subjectivity are people allowed to exercise in linguistic expression? The social norm for the degree of acceptance of the speaker's subjectivity must differ from one culture to another, and from one language to another. In my 1994 study, it was found that American informants expressed third party information as everybody's events more often than Japanese informants did.
So I have argued that for Japanese speakers, public information remains, true or not, other people's information until the end, at least linguistically; and in the Japanese speaker's psychology, it seems, both O-events and D-events belong to the same territory (i.e., other people's information) and stay there forever. Even after the epistemic "necessity" of the proposition is confirmed, this information is expressed in indirect forms. Based on this observation, I have further argued that American culture is more belief-oriented than Japanese culture, in that each speaker's belief about the proposition influences the linguistic forms of public events in American culture, while in Japanese psychology the border between the information territories of "others" and "mine" is not flexible. However, in the present research, Japanese speakers' behavior with regard to O- and D-events was not significantly different from that of English speakers. I attribute this discrepancy between the two studies to a significant difference in the degree of general public familiarity with certain public events at each time (cf. chapter five).

There are opinions that there is no such thing as information territory. For example, in criticizing Brown and Levinson's "face" concept, Matsumoto (1988) quoted Nakane (1967) and said that Japanese culture is group-oriented, so that the concept of individual territory is not typical among Japanese people. Matsumoto said, correctly I think, that the Japanese language is particularly sensitive to social context, especially to one's position in relation to others. But I consider that this group-orientation of Japanese society does not necessarily mean that Japanese people do not have a sense of territory. Every human being (and probably every animal) has some concept of personal territory.
Discussing "space" in Japanese behavioral psychology in relation to the group-oriented nature of Japanese society, Japanese psychologist Kimura (1977: 20-24) referred to the theories of world-famous psychologist Levin, German behavioral scientist Lorentz, and others. These scholars experimentally investigated the functions of the human concepts of self "position" and "territory", and the psychological energy required to move out of an individual's territory into other people's territories. I believe that Japanese people have a sense of personal territory as well as group territory; at least they demonstrate this linguistically. The following sentences show the speaker's sense of group territory and personal territory respectively:

(3-14) Uchi no kaisha no jinji-bu,
       my household POSS company POSS personnel dept.
       zenzen dame yo.
       at all bad PART(VOC)
       (My company's personnel dept. is very inefficient.)

(3-15) Uchi no okusan warito nonbiri shitete sa.
       my household POSS wife fairly laid-back STAT PART(VOC)
       (My wife is fairly laid-back.)

In (3-14), the speaker called the company he works at uchi no kaisha (lit. my household company) and used a declarative form to talk about it. In (3-15), talking about his wife, the speaker also used a declarative mood. In both utterances, it seems that each of the speakers felt that the information was within his territory: group territory in (3-14) and personal territory in (3-15). The overall discourse data indicate that people talked about their professional knowledge, their direct experience, their family, their home town, and other things as information to which they have privileged access (i.e., the knowledge in their territory2), and used the direct mode to talk about them.

Good evidence is the linguistic negotiation of territory borders, which is often seen in subtle morphological modification by conversationalists. If you said to your conversational partner who happens to be a linguist that there is a linguist called Noam Chomsky.
He is coming to Texas to lecture on his political view, your behavior would be considered inappropriate because it disregards your partner's information territory. But if your conversational partner is a rational adult, instead of yelling I know Noam Chomsky!, he might say nicely oh, is that what he's talking about this time? In saying so, he shows that the person named Noam Chomsky and the information associated with him are within his information territory as a linguistic professional.

This kind of negotiation of territory on the deictic level often happens in Japanese, since direct and indirect deixis are important evidentials in the language. In Japanese, unlike English, third person personal pronouns and proper nouns cannot be used by both conversationalists if the referent is not known to both of them. Observe the following English conversation:

(3-16) A: I met Dr. Yen yesterday.
       B: Who is Dr. Yen?/he?/that person?

In (3-16), in English, speaker B can use the proper name, the pronoun he, or the phrase that person to refer to the referent. In Japanese, since speaker B does not personally know the referent, Dr. Yen, speaker B cannot use the proper name (Dr. Yen) or the pronoun he. The following (3-17) and (3-18) are acceptable utterances in Japanese which correspond to English (3-16B):

(3-17) B: Dr. Yen -tte dare?
          Dr. Yen QUOT who
          (Who is the person called Dr. Yen?)

(3-18) B: Sono hito wa dare?
          that person TOP who
          (Who is that person?)

In (3-17) the indirect quotation marker -tte (or -to iu) (called) and in (3-18) the demonstrative sono (that) are used to indicate a referent who is outside the speaker's information domain. The following (3-19) and (3-20), with the proper noun and the personal pronoun he respectively, are not grammatical when speaker B does not know the referent:

(3-19) B: *Dr. Yen wa dare?
          Dr. Yen TOP who
          (Who is Dr. Yen?) (*ungrammatical)

(3-20) B: *Kare wa dare?
          he TOP who
          (Who is he?)
Sentences that are ungrammatical at the discourse level, such as (3-19) and (3-20), are frequently used by learners of the Japanese language, even by those at advanced levels, and teachers do not dare to correct them because the utterances are grammatical at the sentential level. As a matter of fact, in both English and Japanese, speaker A in (3-16) should have said from the beginning "I met a person called Dr. Yen yesterday" if he had known that B did not know Dr. Yen, or if he was not sure about B's knowledge:

(3-21) Kinoo Dr. Yen -tte iu hito ni atta n da.
       yesterday Dr. Yen QUOT person DAT met n COP
       (Yesterday, I met a person called Dr. Yen.)

Sentence (3-21) is more natural than (3-16)A in most cases in both English and Japanese conversation when the speaker knows that the hearer does not share the knowledge of the referent. It is thus evident that in both English and Japanese, the speaker is supposed to be conscious of his hearer's knowledge in deciding the sentence structure (cf. the use of definite and indefinite articles in English). In terms of deixis, Japanese is more "persistent" than English in that a Japanese speaker cannot use proper nouns or third person pronouns for the referent if the referent is not in his information domain. This restriction does not change within a given discourse even after the referent is introduced and fully explained by one of the discourse participants (e.g. Kuno, 1988, Shibatani, 1990, Takubo and Kinsui, 1992).4

Lacoste (1981) showed some interesting examples of negotiation of speech territory between doctors and their patients in French. Doctors are positioned higher than their patients since they use their professional skills to help patients, but, at the same time, they also depend on the patients' description of their physical condition to enable them to use those skills.
Lacoste found, therefore, that in medical interviews the boundary between "patient's events" and "doctor's events" is often blurred and fluctuating. Patients used their knowledge of their physical condition and made attempts to linguistically invade the doctor's territory, while doctors, on the other hand, defended their professional territory by brandishing their professional knowledge. One example of linguistic territory negotiation on the lexical level is shown below:

(3-22)
Doctor: (a) Depuis quand avez-vous mal au ventre?
        (How long have you had this pain in your stomach?)
Patient: (b) J'ai jamais eu mal au ventre, j'ai eu mal a la rate.
        (I've never had a pain in my stomach. I have a pain in my spleen.)
Doctor: (c) Ecoutez, la rate vous n'etes pas force de savoir ou c'est, vous avez eu mal au ventre.
        (Listen, the spleen, you are not supposed to know where that is, you had a pain in the stomach.)
Patient: (d) J'ai mal la (geste de designation).
        (I have a pain there/designative gesture)
Doctor: (e) Comment vous appelez ca? C'est le ventre. Vous avez mal au ventre.
        (What do you call that? That's the stomach. You have a pain in the stomach.)
Patient: (f) Si vous voulez.
        (If you say so.)
(Lacoste, 1981: 172)

Obviously, the doctor in the above conversation was not happy with the patient's use of the word la rate (spleen), or with the patient's assertion that he had a pain in his spleen. The event belongs to the doctor's territory (i.e., professional knowledge). In (3-22c), the doctor's utterance vous avez mal au ventre (you have a pain in the stomach) sounds too direct for speaking about another person's pain, but is supposed to be acceptable because of his profession.

The next example shows negotiation of territory in Japanese through the sentence-ending modality.

(3-23) Child A: ashita okaasan i -nai yo
                tomorrow mother exist -NEG PART(VOC)
                (Tomorrow, our mother will be out.)
       Child B: uso da yo. Iru yo!
                lie COP VOC exist PART(VOC)
                (It's a lie. She will be here).
Adult C: (talking to A) A-chan, okaasan soo i-tte-ta?
         your mother so say STAT PAST
         iru to omo-tta-n da kedo naa.
         exist QUOT think-PAST-n COP but PART(RAPP)
         (Dear A, did your mother say so? I thought she would be here, but...)

In sentences (3-23A) and (3-23B), both children (brothers) used direct endings (declarative modality), indicating that the information about their mother is within their personal information territory. Child A indicated that the information was "his" by using the direct modality; Child B therefore also used direct forms to negotiate territory. Both of them could have used an indirect sentence such as I thought mother would (or wouldn't) be here, as Adult C did in (3-23C), but the children did not choose this alternative, presumably because they did not feel the need to be polite to each other; they are young and their relationship is intimate.

ANOTHER VIEW OF LISTENER-ORIENTED MODALITY IN JAPANESE

There have been some criticisms of Kamio's model. Except for one researcher who specifically stated that Kamio's model is not applicable to the Japanese system of demonstratives (Ono, 1995), those rejecting or criticizing his model have not presented clear reasons for their disapproval; generally, the opponents of the model simply claim that the concept of information territory does not seem to be applicable to Japanese linguistic phenomena as a whole.5

There has, however, been another major approach to the pragmatic functions of Japanese sentence-ending forms on the discourse level. Takubo and Kinsui (Takubo, 1990 and 1992; Kinsui, 1990; Takubo and Kinsui, 1990, 1992) proposed a Japanese discourse model based on Fauconnier's mental space theory6 as well as discourse marker theories by Schiffrin (1987) and others. They named the theory "danwa kanri riron" (theory of discourse management). I will call their theory "mental space theory" in this chapter.
As the name implies, this theory attempts to explain Japanese linguistic issues from the viewpoint of the speaker's assumption about the hearer's knowledge about the proposition expressed. It is true that we usually have some particular hearer in mind any time we make an utterance. A speaker needs to take the hearer's knowledge into consideration, and choose appropriate linguistic forms such as words or sentence structures. When the speaker introduces a new issue in discourse, he needs to linguistically indicate that the issue is new (e.g. Yesterday, James found a peach in our yard). After the speaker introduces a new issue, which is not shared by the hearer, the speaker, before making his next utterance, needs to consider how the hearer's knowledge has been changed by the information that he has just given to the hearer (e.g. The peach was actually a giant peach). This kind of discourse managing behavior based on the hearer's assumed knowledge is normally seen in every language, but how to do so must vary across languages.

The theory of discourse management assumes that mental space is a discourse management system. Mental space is considered to be a layered database, and each utterance in conversation is a kind of command to use the database to register, search, infer, and so forth. The authors claimed that in Japanese, mental space is divided into two areas: a "direct experience area" and an "indirect experience area". The direct area involves long-term memory, episodic memory acquired through direct experience, and knowledge that is obtained from the on-going conversation. The indirect area contains information that is obtained linguistically (i.e., reading or hearing as indirect experience). In the mental space theory, the hearer's assumed knowledge is speculated to be in the indirect memory area of the speaker.
In short, Takubo and Kinsui suggested that we have three interacting areas of memory: the direct information field (for directly obtained knowledge), the indirect information field (for indirectly obtained knowledge), and the hearer's knowledge field within the speaker's indirect information field (since it is only assumed by the speaker as his indirect experience). Their theory assumes that the sentence-ending modality and other modals are the speaker's "message" to the hearer or to the speaker himself to organize memories in different memory areas. Just as Fauconnier hypothesized that the same information exists in multiple mental spaces and is described differently linguistically, Takubo and Kinsui assumed that the same information can exist in different connected memory spaces. They attempted to explain nouns/third person pronouns, sentence-final particles, and demonstratives in order to indicate how the speaker interacts with the same information in different memory spaces.7

I do not consider Takubo and Kinsui's approach to be significantly different in effect from Kamio's model, at least on the issue of the relationship between the sentence-ending forms and proposition types. As Kamio did, Takubo and Kinsui paid attention to the hearer-sensitivity of Japanese sentence-final forms and explained the function of the forms. Takubo and Kinsui used the concept of the memory space of the speaker and the hearer, while Kamio used the concept of the information territories of the speaker and the hearer, as [3-24] shows below. In both models, forms of sentential modality are related to the types of information. Both theories assume four similar basic categories of evidentiality types.
The difference between the two models is that in Kamio's model sentence-ending forms and information domains are simply connected, while the mental space theory views particular words (including the sentence-ending forms) as distinctive "signs" or "commands" presented by the speaker in organizing information in the memory space of both himself and his hearer. For example, Takubo and Kinsui (1992) claimed that the Japanese sentence-final particle ne expresses the speaker's "command" for confirmation if information exists in two places (his memory and the hearer's memory).

[3-24] Information territory theory vs. mental space theory

Type of event: direct information for the speaker
  Information territory theory: in the speaker's territory (A) (direct ending)
  Mental space theory: in the speaker's direct memory space (a)
  Evidentiality: direct (speaker's evidence)

Type of event: indirect information for the speaker
  Information territory theory: in the other people's territory (D) (indirect ending)
  Mental space theory: in the speaker's indirect memory space (b)
  Evidentiality: indirect

Type of event: direct information for the hearer
  Information territory theory: in the hearer's territory (C) (indirect + ne ending)
  Mental space theory: in the hearer's memory space in the speaker's indirect memory space (c)
  Evidentiality: indirect (hearer's evidence)

Type of event: shared information for the speaker and the hearer
  Information territory theory: in the shared territory (B, CB, BC) (daroo, ne-related endings)
  Mental space theory: (a) or (b) and (c)
  Evidentiality: direct (shared)

(Note: A, B, CB, BC and D are from [3-9, 3-10] in this chapter.)

(3-25) Kimi no tanjoobi wa san-gatsu desu-ne.
       you POSS birthday TOP March COP(FOR) PART(CONF)
       (Your birthday is March, we have the same information, don't we?)

In (3-25), the proposition is the hearer's matter but the speaker knows it too, so the speaker confirmed the existence of the same piece of information in two places of his own memory (the speaker's indirect memory area and the assumed hearer's memory area within the speaker's indirect memory area) by saying (3-25), where ne is the "sign" of this "memory-matching" action.
Takubo and Kinsui characterized the final particle yo as a speaker's command to the hearer to write information in the indirect memory. That function can perhaps be phrased as a "speaker's declaration of some speaker's matter" of which the hearer has no knowledge.

(3-26) A: ogenki desu-ka?
          well/active COP(FOR) Q
          (How are you?)
       B: watashi wa moo 70 desu-YO.
          I TOP already 70 COP(FOR) VOC
          (I am already 70 years old, now you know I must not be very well.)
       (Takubo, 1992: 23)

In the above conversation, the surface meaning of B's answer (i.e., I am already seventy) is not straightforwardly relevant to A's question, and it is therefore considered to be a case of "implicature" in Grice's sense. In English, the hearer is required to contextually analyze the implicated meaning (or to find out whether it is an implicature or a "blatant" failure to fulfill a maxim), while in Japanese the final forms such as particles help to signal the existence of an implication expressed by the speaker, as indicated in (3-26). In a sense, this phenomenon implies the importance of final forms in Japanese pragmatics from the viewpoint of the Cooperative Principle.

In Kamio's theory, sentence (3-26)B is simply within the speaker's own information territory, so that the direct form desu is acceptable and the particle yo is optional. Actually, ne is the only sentence-final particle that matters in Kamio's model. This is reasonable since, among particles, only ne (and possibly na) seems to function to indicate shared knowledge (e.g. McGloin, 1990, Ueno, 1971). In the same way, the mental space theory defined the particle yone as a sign to confirm the sameness of the information which has just been written in the speaker's indirect memory area and the information which already exists in the hearer's memory area.
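The characterization of mental space as a layered database, with particles acting as commands on it, can be illustrated with a toy model. The following is a minimal sketch under stated assumptions, not Takubo and Kinsui's own formalism: the class `MentalSpace`, its field names, and the functions `yo` and `ne` are hypothetical, propositions are represented as plain strings, and the update of the speaker's model of the hearer inside `yo` is an added assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class MentalSpace:
    """One participant's memory areas (hypothetical representation)."""
    direct: set = field(default_factory=set)    # directly obtained knowledge
    indirect: set = field(default_factory=set)  # linguistically obtained knowledge
    # The speaker's assumption about the hearer's knowledge; the theory
    # places this field inside the speaker's indirect area.
    hearer_model: set = field(default_factory=set)


def yo(speaker: MentalSpace, hearer: MentalSpace, proposition: str) -> None:
    """yo: command to the hearer to write information in the indirect memory."""
    hearer.indirect.add(proposition)
    # Assumption of this sketch: the speaker also updates his model of
    # what the hearer now knows.
    speaker.hearer_model.add(proposition)


def ne(speaker: MentalSpace, proposition: str) -> bool:
    """ne: check that the same information exists in two places of the
    speaker's memory (his indirect area and his model of the hearer)."""
    return proposition in speaker.indirect and proposition in speaker.hearer_model


# (3-26): B tells A something from B's own direct memory.
b = MentalSpace(direct={"B is 70"})
a = MentalSpace()
yo(b, a, "B is 70")

# (3-25): the speaker knows the hearer's birthday only indirectly.
s = MentalSpace(indirect={"hearer's birthday is in March"},
                hearer_model={"hearer's birthday is in March"})
matched = ne(s, "hearer's birthday is in March")
```

In this toy model, `yo` only writes and `ne` only compares; the many other functions attributed to these particles in the literature are outside its scope.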
So far, Kamio's model and Takubo and Kinsui's model do not appear significantly different from each other with regard to the function of the sentence-ending forms; they merely have different viewpoints. However, a difference appears in the analysis of the ending form daroo (deshoo), demonstratives, and other noun phrases. As is noted in chapter two, the Japanese auxiliary daroo is traditionally said to have two distinct meanings: one is conjecture (probably) and the other is confirmation (tag-question isn't it? etc.), as in the following examples:

(3-27) Uchi no imooto, kirei daro.
       my POSS younger sister pretty AUX(confirmation)
       (My sister is pretty, isn't she?)

(3-28) Ano hito, ko-nai daroo to omotte -ta.
       that person come-NEG AUX QUOT I think(STAT)-PAST
                            (conjecture)
       (I expected that person would probably not come.)

Sentence (3-27) shows "confirmation daroo" and (3-28) shows "conjecture daroo". In the mental space theory, since the proposition of a speaker's conjecture is not supported by direct evidence, it should be written in his indirect memory area. Therefore, conjecture daroo is a sign that a proposition is to be written in the speaker's indirect memory, while confirmation daroo is the speaker's sign (or command) to the hearer to write information into the hearer's direct memory area, since the information which needs to be confirmed is naturally shared by the hearer. The theory specifies that the hearer's information area resides in the speaker's indirect memory area; therefore, in the mental space theory, the auxiliary daroo (both "conjecture" and "confirmation") is characterized as a sign that the speaker inputs his information into his indirect memory space. By doing this, the theory unifies the functions of the two types of daroo. I believe that this view is also insightful. The mental space theory seems to be more expandable to other areas of linguistics, but how far it can be applied is not yet known.
One problem with Takubo and Kinsui's mental space theory is its premise that only information obtained by direct experience or long-term memory, and stored in direct memory space, can be linguistically described in direct forms. This premise does not match the actual Japanese usage of direct/indirect language forms. In reality, as Kamio clarified, Japanese speakers use direct forms to describe information which they did not obtain through direct experience but to which they feel they are socially entitled to claim intimacy. The theory of territory of information explains the Japanese concept of direct information well. Also, some phenomena in Japanese that do not conform to the universal evidentiality rules are easy to understand in the framework of information territory. In (3-29), the speaker (F2) provided an episode concerning Princess Masako. She made her statement in a direct form, which caught the attention of her hearers.

(3-29) F2: Masako-san, kekkon suru mae ni
           Princess Masako marriage get before TEMP
           esute janai kedo, nannka kayotte -ta -no yo.
           aesthetic NEG but something go (STAT)-(PAST)-n VOC
           (Princess Masako frequently went to somewhere like an aesthetic salon before she got married, I am telling you.)
       Others: sugoooi .. ..johoo ga
               extravagant information NOM
               (What an information source you have!)

This is an example in which a speaker evidentially claims that a given piece of information is in her territory although it is not supposed to be. This violation of territory rules was made intentionally by the speaker, who proudly announced that she watches almost all midday TV talk shows and has become very "resourceful" about popular gossip.

Violation of territory rules also occurs in the opposite way. In the following (3-30), by using the indirect auxiliary mitai (it seems), speaker B appears to have held back from claiming ownership of her information:

(3-30) A: Go-shujin no kaisha doo?
          Your husband POSS company how
       B: Chotto dame mitai.
Raigetsu heisasuru-koto ni
no-good it seems Next month close COMP DAT
kimatta -tte. Shujin ga kinoo itteta wa.
decided QUOT My husband NOM yesterday said STAT RAPP
A: How is your husband's company doing?
B: It seems that it is not doing well. I heard they decided to close the company next month. My husband told me yesterday.

In (3-30), the speaker, in talking about her husband's business, which is closely related to her own life, used the indirect form mitai (seem). Her intention can be understood as modesty in respecting her husband's information territory. These phenomena of the "assertion of information ownership" (i.e., non-use of socially required indirect forms) as in (3-29) and the "speaker's intentional neglect of information ownership" (i.e., non-use of socially approved direct forms) as in (3-30) can be well explained under the assumption that information territories exist.

In light of these observations, it seems reasonable to hypothesize psychological information territories which a speaker perceives in interactional spoken discourse. The concept of territory may be only a surface view of Japanese modality, but it is very useful for systematizing the use of sentence-ending evidentiality.

CHAPTER 3: NOTES

1 However, it is true that Japanese speakers do not often "explain" the details of their contention, under the assumption that the hearer knows what the speaker is talking about. Thus, an extensive explanation of a topic tends to be considered impolite. This behavior is problematic because it often results in miscommunication. This cultural issue is discussed in chapter seven in relation to the Japanese background of evidentiality markings. A grammatical aspect of Japanese which emphasizes the speaker's delicate concern with the listener is called the "empathy" phenomenon in Japanese grammar. It involves the speaker-listener relationship as an important aspect of, for example, syntax. Kuno (e.g.
1976, 1978, 1987) drew academic attention to "speaker-empathetic" phenomena in Japanese grammar. He defined "empathy" as "the speaker's identification, which may vary in degree, with the person/thing that participates in the event or state that he describes in a sentence" (1987:206). Actually, such phenomena are not limited to Japanese. As an example, Kuno cited the following English sentences, which describe a situation where John hit his brother Bill:

[3-31]
John hit Bill.
John hit his brother.
Bill's brother hit him.
Bill was hit by John.
Bill was hit by his brother.
?? John's brother was hit by John.
* His brother was hit by John.

The last two sentences are syntactically grammatical, but their acceptability is lower than that of the others due to the discrepancy between the speaker's empathy and the sentential subject: Kuno argued that the structural subject legitimately receives the highest focus of the speaker's empathy, but the phrases "John's brother" and "his brother" are not "empathetic" from the speaker's perspective. Kuno gives five different hierarchies which interact with each other to produce different degrees of acceptability. The following is a summary of his empathy hierarchies:

[3-32]
The Speech Act Empathy Hierarchy: the speaker must empathize with himself rather than any other person or object;
The Topic Empathy Hierarchy: the speaker must empathize with a discourse topic rather than a non-topic;
The Descriptor Empathy Hierarchy: between two given descriptors (e.g. 'John' and 'John's brother'), the one on which the other descriptor depends shows the speaker's focus of empathy;
The Surface Structure Empathy Hierarchy: the subject of a sentence is the focus of empathy;
The Word Order Empathy Hierarchy: the left hand NP in a coordinate structure is more readily empathized with than the right hand NP.
According to the theory, there cannot be more than one focus of empathy within a given sentence; therefore, if there is a conflict among multiple empathy targets, the sentence will not be acceptable ("Ban of Conflicting Empathy Foci"). This observation might be valid across languages. Based on his series of empathy theories, Kuno explained that certain phenomena of Japanese grammar, such as the auxiliary use of "giving and receiving" verbs, reflexives, and empathy adjectives, are empathy-oriented. Kuno's argument emphasized the role of the speaker's subjectivity in producing sentences.

2 As introduced in chapter two, Kamio listed and characterized the three major categories of the information which belongs to (A)-type (only the speaker's) territory as follows:

[3-33]
(1) Information about direct experience: Information that is obtained through the speaker's direct experience is a central component of information that falls within his territory of information.
    (e.g.) Watashi atama ga itai.
           I head NOM ache.
           (I have a headache.)
(2) Information about personal data:
(2a) Personal information: Even if a speaker lacks direct experience, personal information such as family matters falls within the speaker's territory.
    (e.g.) Kanai wa 46 desu.
           my wife TOP 46 years old COP(FOR)
           (My wife is 46 years old.)
(2b) Geographical information: A subclass of personal information involves geographical information which is intimate to the speaker. The following sentence should be expressed as falling in the speaker's territory if the speaker is from Kyoto.
    (e.g.) Kyoto no jinkoo wa 150-man gurai desu yo.
           Kyoto POSS population TOP 1,500,000 about COP(FOR)(VOC)
           (The population of Kyoto is about 1,500,000.)
(2c) Information about plans, actions, and behavior: Another subclass of personal information.
    (e.g.) Kore kara Osaka e ikimasu.
           this from Osaka LOC go(FOR)
           (I am going to Osaka now.)
(3) Information about expertise (e.g.)
Travel agent: Pari e wa chokkoubin ga benri desu.
              Paris LOC TOP direct flight NOM convenient COP(FOR)
              (To Paris, a direct flight is convenient.)
(e.g.) Professional demographer: Kyoto no jinkoo wa 150-man gurai desu yo.
              Kyoto POSS population TOP 1,500,000 about COP(FOR)(VOC)
              (The population of Kyoto is about 1,500,000.)

Therefore, in Kamio's model, a direct assertion which falls in the speaker's territory is based not only on the speaker's direct experience but also on knowledge from his profession and personal data. The speaker is "socially authorized" to speak about these topics in direct forms.

3 The pragmatic use of the auxiliary daroo (tag-question) was first systematically explained by Kinsui (1992) with his mental space theory. Kamio's original model (1990) had four territories of information, but he later revised it into one with six "cases" of interaction of the speaker's and the hearer's territories (1994). In Kamio's original model, the auxiliary daroo was not involved as an important form of sentence-final modality.

4 Observe the following example of the choice between direct and indirect deixis in Japanese, shown in a conversation between person A and person B:

(3-34) A: UCLA no Akatsuka-tte iu gengogakusha ga
          UCLA POSS PROPER NAME QUOT linguist NOM
          kondisyonaru to episutemorogee no hanashi
          conditionals and epistemology MODI topic
          kaiteta naa.
          wrote STAT I recall
          (A: A linguist whose name is Akatsuka at UCLA wrote an article about epistemology and conditionals, I remember.)
       B: Akatsuka wa episutemikku sukeeru no aatikuru ga
          PROPER NAME POSS epistemic scale MODI article NOM
          moo hitotsu at-ta deshoo.
          more one exist-AUX(PAST) AUX (confirmation)
          (B: There is another article of Akatsuka's concerning the epistemic scale, isn't there?)

In (3-34), speaker A used the quoted expression Akatsuka-tte-iu gengogakusha (a linguist named Akatsuka), implying that A assumed that B does not know Akatsuka.
If B did not know the referent, as A assumed, B would be expected to accept the indirect modality of the noun phrase that speaker A assigned to the referent and to use it (e.g. sono Akatsuka-tte iu hito [that person named Akatsuka]). But in reality B knew Akatsuka, so speaker B in (3-34) did not use the indirect quoted form of the referent; instead she simply used the direct noun form Akatsuka. By doing so, speaker B demonstrated that she knows Akatsuka well and that Akatsuka is in her speech territory, contrary to speaker A's assumption, which might have been perceived as impertinent. In Japanese, a speaker is required to use a deictic expression as it was introduced into the discourse by his conversation partner until both parties find that they have the same information. I feel that B's act in (3-34B) is nothing but a negotiation of personal speech territory, which I perceive as aggressive. If speaker B had desired to be polite, B should have used the quoted indirect expression that A had used, admitted that she knows Akatsuka, and then shifted to a different referring expression, as in (3-35):

(3-35)
B: Aa, sono UCLA no Akatsuka-tte iu gengogakusha nara shitteru wa.
Oh, that UCLA POSS PROPER NAME QUOT-call linguist COND know PART(RAPP)
Akatsuka no episutemikku sukeeru wa moo hitotsu aatikuru ga atta desho.
PROPER NAME POSS epistemic scale TOP more one article NOM existed doesn't it?
(B: Oh, I know that person called Akatsuka at UCLA. Wasn't there another article of Akatsuka's concerning the epistemic scale?)

In (3-35), speaker B replied using the indirect quoted form of the proposition (linguist called Akatsuka) as introduced by the conversational partner A, not asserting her information territory. By context, B in (3-35) indicated that the proposition is shared by both sides. Since B in (3-35) used the indirect modality first, her utterance would be considered polite by all.
Also, speaker A could have been polite by showing from the beginning that he assumed the proposition was shared by hearer B, using the direct noun without the quotation markers. In this way, the use of deictics presents another important "territory" factor in Japanese pragmatics.

5 Whether or not Kamio's concept is applicable to the whole system of Japanese pragmatics is not known. That issue is beyond the scope of this dissertation. However, Kamio (1990) certainly attempted to show that the concept is applicable to a wide range of linguistic phenomena in both English and Japanese. He attempted to apply the theory of information territory to various language structures such as sentence structures (e.g. cleft sentences, presuppositional phrases, performative sentences, thetic judgements), noun phrases (e.g. anaphors, demonstratives), lexical meanings of some words (e.g. come vs. go, this vs. that), and other discourse aspects such as intonation and honorifics.

6 Fauconnier's mental space theory: Fauconnier (1985) originated a pragmatic theory of semantics named the mental space theory (espaces mentaux). This theory is useful for evidentiality studies in that it deals with the psychological connection between linguistic forms and direct/indirect memories. The theory uses basic mathematical concepts to solve some problematic semantic issues. Fauconnier argued that the central features of language organization depend on their links with other cognitively motivated structures, and that linguistic expressions contribute to setting up connected mental domains. Fauconnier posited that we have multiple mental worlds (or spaces), which are connected with each other and reflect the real world differently. He said that "Linguistic expressions will typically establish new spaces or refer back to one already introduced in the discourse." (p. 17) He explained that linguistic "space-builders" may be prepositional phrases (e.g.
in Len's mind, in 1929, at the factory), adverbs (e.g. really, probably, theoretically), connectives (e.g. if A then B, either A or B), and underlying subject-verb combinations (e.g. Max believes, Mary hopes). For example, consider the following sentences.

(3-36) Susan likes Harry.
(3-37) Max believes that Susan hates Harry.

According to Fauconnier's theory, sentence (3-36) presents space R (origin = "speaker's reality") by establishing a relation between Susan and Harry in space R (= Reality). In sentence (3-37), the phrase Max believes is a space-builder which establishes space M. The phrase Susan hates Harry establishes a relation between Susan and Harry in space M, which happens to be different from reality. The theory explains that, for sentences (3-36) and (3-37), we must assume two mental worlds; the two worlds are connected by a function called "connector F", and the relationship between Susan (a) and Susan (b) in the two worlds is described as F(a)=b. This identity relationship means that both Susans are the same person. The theory is relevant to the study of territory as well as evidentiality in that it argues that a speaker linguistically expresses which space in his mind his information or knowledge belongs to. Fauconnier applied the theory to various linguistic issues: anaphoric pronouns, definite descriptions, assumption, conditionals, comparative sentences, and others.

7 I speculate that mental space theory is promising in providing a "deep structure" of Japanese modality usage, while the territory theory provides a sort of "surface account structure". It is true that when we talk to somebody we constantly need to refer to our hearer's knowledge (what we assume they have) somewhere in our memory and to show linguistically our understanding of the hearer's changing knowledge in the ongoing discourse. So neurologically, the mental space model might reflect the biological behavior of our brain.
The consequence of this mental behavior, i.e., a speaker's choice of evidential and other modality for each utterance, may be seen as reflecting, on the surface, the model of territories of information as in Kamio's framework.

CHAPTER 4: METHODOLOGY

Creating a realistic model of the Japanese evidentiality system naturally requires a thorough investigation of the actual use of Japanese evidentials. This study may be considered sociolinguistic quantitative empirical research in that the analysis is genuinely based on data collected from informants' natural everyday speech in various speech situations. I have examined individuals' linguistic performance in my native language and culture. In this sense, I have an advantage in understanding the language user's meanings, both surface and intended, but at the same time my perspective may lack "objectivity" due to my status as an insider. I tried to be cautious regarding this concern, and have sought out third persons' opinions as much as possible to ensure that my interpretation of informants' meaning is proper. In particular, understanding the speaker's meaning encoded in a subtle difference of intonation (sentence-ending tone, for example) is a difficult task which may produce disagreement even among native speakers. However, the primary judgement of the meaning of informants' speech behavior was my own.

DATA COLLECTION

Most of the data collection was done in the informants' familiar environment with native culture (i.e., Japan, or a quasi-Japanese community in the U.S.A.). The data corpus was collected between 1990 and 1997, but the majority was obtained in 1996. The American sites were primarily Madison, Wisconsin, and Austin, Texas, where I engaged in M.A. and Ph.D. studies. During this time, my primary interest was in discourse analysis; main areas included "tense-alternation", "discourse organization", "Represented Speech and Thought (or RST)" (cf.
Banfield, 1982), "speaker's subjectivity and discourse grammar", "common cultural understanding for discourse background", and "hearsay discourse". In performing research on these interests, I collected a variety of spoken discourses (e.g. storytelling, conversational, and interview discourse). Since I taught Japanese during this period in both places as a teaching assistant and assistant instructor, I became acquainted with a number of Japanese graduate students, who were my main informants at the American sites. Most of them belonged to more or less the same age group (25 to 35 years old), and speech events were generally informal. With the purpose of obtaining more diverse data in regard to speech setting, I spent six weeks in Japan (Tokyo area) in 1996. During this time, I met friends, their families, and their friends, and visited their workplaces and other social occasions to acquire an extensive data collection. More informal data were collected than formal data, but I believe that videotaped/audiotaped formal speech events from publicly available speech situations (e.g. "TV interview program", "news report show", and "public talk") supply sufficient formal speech data. Informants were from a wide range of age groups, from eight years old to people in their seventies. The following table shows the schematic stratification of the informants and the quantity and type of speech data I actually used for this research.

[4-1]
Number of informants:
(age) 0-9 10-19 20-29 30-39 40-49 50-59 60- Total
Male 3 2 8 6 4 5 28
Female 3 9 3 11 1 2 29
Students 17* 20* (37*)
(* Students' data were not individually analyzed, but were treated as group data.)

Recording hours:
Audio tapes: approx. 20 hrs
Video tapes: approx. 5.5 hrs

Number of speech events:
Formal group: 14
Informal group: 11
Public: 5
School: 2
Courtroom: 4

Number of speech units examined: approx. 10,700
Number of speech units (i.e., sentences) with clear modality and used for analysis: 7,024

Number of speech units analyzed:
Formal group: 1,993
Friends: 1,904
Family: 1,462
Public: 401
School: 630
Courtroom: 634

Informants are numbered M1 through M28 for males, F1 through F29 for females, and S1 and S2 for the two groups of students (cf. Appendix A). In [4-1] above, the informants are partitioned simply according to biological background information, namely age and sex. My intention was to collect a variety of speech events which involve different types and degrees of formality created by speech situations, including a variety of relationships among the speakers. So overall, information concerning the speakers' relationship, such as power difference, is considered to be included in the categorization of speech situations. Speech situations are roughly grouped into six types: "formal group conversation", "discourse of talking to public", "informal group discourse", "family discourse", "teacher and student discourse", and "court discourse". Family discourse is naturally considered to be "informal", but it is regarded as an independent group based on the speculation that Japanese family members share a strong sense of in-group membership and that this might affect the rules of evidentiality within the group. Therefore, "family discourse" and "informal group discourse" fall under the overall category of "informal discourse", while "formal group conversation", "talking to public", "teacher's discourse", and "courtroom discourse" are considered to fall under the category of "formal discourse". However, each discourse type was analyzed independently because some differences in evidentiality phenomena were observed among the groups. Most informants were well-educated members of the middle class.1 The sample is actually a "convenience sample", given the constraints of gathering field data from familiar people, so informants are not evenly stratified.
While there may be more suitable groups of informants equally distributed among age groups, I believe that the given group of speakers suffices for the purpose of this research. I also believe that the process of data collection was highly natural because I was a participant in a high proportion of the data. It has been suggested that face-to-face interviews are appropriate for quantitative research that requires volume and quality of recorded speech; however, the "experimental effect" is unavoidable in interviews (e.g. Labov, 1984). Fortunately, this was not a serious problem in this research since I was, most of the time, a "participant-observer" in group settings, although I was sometimes an interviewer in initiating talks. With the exception of data collected from public speech, in group discussions there were often more than two speakers besides myself, and I was familiar with most of the informants. It is still undeniable that the act of recording may have caused a "recording effect", but I noticed that my informants often forgot the existence of the tape recorder when in a group of people. Part of the data was procured from face-to-face interviews, for which the experimental effect can be anticipated. When doing interviews and also when participating in group conversation, I used, when applicable, some prepared discourse topics for the informants to talk about. The main concern of this research as an evidentiality study was to see how informants talk about information from different "information sources"; therefore topics were chosen on the spot to elicit utterances about information from both the direct and the indirect experience of the speaker. In order to elicit discussion of the informants' direct experience, I asked about their work, family, and other things, in the past, at present, and in the future, which seemed most interesting to them.
In order to let them talk about issues which did not directly concern them, I used social issues of the time. Fortunately for me, but unfortunately for the community at large, Japanese society at that time had several serious public issues about which people were very well informed: the Aumu-shinrikyoo (Aum-cult) case and the Yakugai-AIDS (AIDS blood serum) case.2 Spoken discourses were tape-recorded with a SONY cassette-recorder TCM-S67V with microphone. Informants' written permission was sought prior to tape recording, and an outline of the research purposes was briefly explained to each informant. Since the research topic is fairly linguistically specific, I believe that most of the participants did not pay much attention to my academic interest. I think that their nonchalant attitude to the purpose of my recording worked favorably in that the speech data were not influenced by the speakers' awareness of the purpose of the research. Data collection was not combined with more comprehensive long-term studies of the overall linguistic performance of the informants, since I am familiar with the culture of their speech environment. Therefore, the data are, more or less, "on-the-spot" data. For some informants, data from different speech situations were obtained to see the same speaker's variation of language use in response to changing social factors, but a large part of the collected speech was treated as "speech chunks" to present evidence of linguistic forms (i.e., evidentiality markings) in different speech situations. In this sense, the quantitative part of the analysis of linguistic forms will appear to be a fairly mechanical matter of looking for consistency in the occurrence of certain linguistic phenomena in certain types of social situations.
However, qualitatively, attention was paid to the nature of the speech setting because it was speculated in the research plan that Japanese evidential expressions are under the influence of different kinds of "hearers", while many evidentiality studies (e.g. Palmer, 1986; and Chafe, 1986) suggest that the speaker's experience is the basic and major factor that the speaker relies on to employ evidential markers. We are all aware that even a short conversation can involve all attributes of the speaker, the hearer, and their relationship, as well as other environmental factors of the speech (e.g. bystanders and location). As the target of sociological analysis of evidential forms, the hearer's social relationship with the speaker is the issue of analysis, i.e. how distant the relationship between the conversationalists is. Naturally, speakers have different types of hearers. Hearers can be superior (e.g. boss at work) or inferior (e.g. child) to the speaker, or on an equal status with the speaker (e.g. friend), and a speaker must have different "speech styles" for each kind of hearer. Theoretical linguistics as well as linguistic pragmatic theories often assume an ideal speech situation with an idealized addressee, but in actuality each speech situation may have different rules of linguistic epistemic coding. Perhaps we do not hesitate to say "My salary is too low!" to somebody intimate to us, but certainly we will be less direct to our superiors and phrase it as, for example, "My salary seems to be lower than one would expect judging from reported industry averages." A speaker's epistemological level is marked differently by the choice of sentence modality. Therefore, sentence modality expressions are also a sociological issue of the speech environment.
In this sense, even though this research is not about comprehensive human speech behavior, it will be able to show us a subset of Japanese speech behavior in relation to social realities through a very small focal point, i.e., linguistic forms of evidentiality.

THE DEFINITIONS

First of all, formal and informal speech situations need to be defined. The primary aim of this study is to determine how situational features (e.g. types of occasion, speakers' biological and social background, power relationship between speakers) influence a speaker's evidential coding in naturally occurring speech in a variety of formal and informal speech situations. The speech level is usually controlled by formality factors, in which the speaker's speech style varies along a dimension of formality. It has been pointed out that a formal occasion calls for polite language use (e.g. Shibatani, 1990; Ide 1982). The factors that contribute to formality are various: the nature of the addressee, the perceived formality of the occasion, the nature of the topics of discussion, the nature of the bystanders, and others (e.g. Shibatani, 1990). Formal and informal speech situations are often defined by the use of linguistic features such as syntactic standardness, phonological standardness, morphological fullness, etc. (e.g. Labov, 1972b; Ervin-Tripp, 1972). However, for convenience, I consider in-group speech settings to be informal, and out-group settings to be formal. Discrimination of uchi (in-group) from soto (out-group) is one of the fundamental principles of Japanese social interaction, together with the social concept of vertical hierarchy. Historically, Japanese society has been considered to be group-oriented, in which people are conscious of their status as members of their groups. A group can be any gathering of people such as colleagues at work, schoolmates, club members, family members, couples, siblings, neighbors, and town-dwellers.
People often refer to groups they belong to as uchi. Uchi, which is nearly the same as ie, literally means household. A businessman may call the company he works for uchi no kaisha, which literally means my household's company. In the same way, a university professor or a university student may refer to his school as uchi no daigaku (lit. my household's university). Sociologists such as Pelzel (1970), Bachnik (1983), and Nakane (1967) argued that ie is not only a kin-based domestic group, but any unit in which social and economic life is involved. This concept of "my group is my household", as a matter of fact, contributed to the development of the Japanese economy through workers' devotion to their corporate employers. Interestingly, some sociology studies suggest that Japanese people do not have a solid sense of nationality (e.g. Sakaiya, 1991). This is probably because the relation with immediate groups is of primary importance. Groups can be small or large, and an individual normally belongs to a number of groups. Some anthropological studies characterize Japanese people as being psychologically comfortable within their groups, and very apathetic to groups they do not belong to (e.g. Nakane, 1967; Doi, 1973). It can be argued that Japanese people are conscious of group territories as well as personal territories, which has the potential to influence language use. Usage of Japanese honorifics in the selection of verbs, nouns, and grammatical forms is often dependent on the relative group membership of the listener, speaker, and referent.3 In this research, the types of groups will involve "family", "close friends", "work friends", and others for informal settings, and "TV interview", "public talk", "teacher/student interaction", "formal conversation", "courtroom discourse", and others for formal speech settings. One problem that may arise here is that an individual may behave informally in a supposedly formal setting, or vice versa.
Even though Japanese linguistic behavior is significantly influenced by highly structured honorifics, speakers' language use is not completely automatic in a given speech situation. Within an acceptable range, there are variations in the situational use of honorifics (e.g. Ikuta, 1983; Wetzel, 1984; Dunn, 1992, 1996). "Affection" between the speakers may override the status difference and create informality in a formal environment, or "ill feelings" may bring forth an entirely informal-style conversation or ultra-formal language. Therefore, to make the analysis simple, alongside the distinction between objectively formal and informal types of speech situation, I paid attention to the formal/informal sentence-ending forms that informants used. Japanese plain sentences for informal conversation end with verbal and adjectival dictionary forms, or the copula -da (present tense) and -datta (past tense) after nouns and adjectival nouns, or their related forms (e.g. negative forms). Japanese polite sentences end with either the verbal endings -masu (affirmative present) and -mashita (past tense), or the copula forms -desu (present) and -deshita (past), or their related forms. Usually, these polite sentence-endings are considered to be a form of honorifics known as "performative honorifics" (or "addressee-oriented honorifics").4 When a speaker used polite sentence-endings for most of a discourse, I understood that the speaker considered the conversation to be formal for himself, although the degree of formality varies widely. I used this criterion for grouping discourse types. However, I was also aware that one particular usage of honorifics does not indicate a unique social context. For example, plain form speech can be used by a speaker to a lower-status addressee as well as to an equal-status addressee. Addressee-oriented honorifics (i.e., polite sentence-endings) can be used by the speaker to addressees of lower, equal, and higher status.
This indicates that a speaker's decision to use either the "plain" or the "polite" form involves other factors of "perceived distance" between himself and the speech situation besides the addressee's status. Therefore, while one particular social context may require one particular level of honorifics (e.g. a formal discussion with equal-level addressees requires the speaker to use the polite level of honorifics), the reverse is not always true (e.g. the use of the polite level of honorifics does not always indicate that the speaker is speaking to equal-level addressees). The following table [4-2] indicates the relationship between the social-status relationship of speaker and addressee and the possible use of plain forms, polite honorifics, and hyper-polite honorifics in spoken discourse:

[4-2] Possible grammatical forms of Japanese and types of addressee

                                             lower-status   equal-status   higher-status
                                             addressee      addressee      addressee
plain form                                   yes            yes            no
polite form (performative honorific)         yes            yes            yes
hyper-polite form (performative honorific)   no             no             yes

Therefore, neither the polite form of honorifics nor the plain form has a decontextualized social meaning. This means that a speaker's decision to use the "plain" (informal) or the "polite" (formal) form indicates his integrated perception of the nature of a given speech situation.

It is also necessary to clarify the "unit" of analysis. In this research, a "sentence" is regarded as a unit. A sentence is often considered unsuitable as a unit of speech. For example, in her research on discourse markers, Schiffrin (1987) pointed out that sentence structure and the meaning of a "speech act" are not relevant to each other, and suggested that "interactionally situated language use is sensitive to constraints quite independent of syntax." Schiffrin concluded that "sentence structure is not the most useful unit to understand language use and social interaction" (1987:32).
This may be true for many conversation/discourse analyses of the interactional meanings of language use (e.g. turn-taking, silence, hedges, back-channeling). This dissertation is also about interactional language behavior; however, this research views the issue from the sentence form, in particular from sentence-ending morphological forms. Therefore, treating the sentence as a unit of analysis is inevitable. Unfortunately, spoken sentences are often so incomplete that identifying sentence boundaries is difficult (e.g. Crystal, 1980). This is a very critical problem in the Japanese language; the sentence ending is often intentionally omitted in Japanese to make the modality ambiguous. The following conversation shows examples of incomplete sentences.

(4-3)
F5(1): Nani sore.
What that?
(What is that?)
F2(2): Nani gasu tte-iu-n-dakke.
What gas QUOTE-n-Q
(What was the gas called?)
F3(3): Wakannai kedomo, VH toka, dokugasu...
don't know but VH(PROP) something like poison gas...
[I don't know, but poison gas like "VH"... (incomplete).]
F2(4): Nanka sono gasu o sutta dake de moo shin-jau...
somewhat that gas NOM inhale only INS soon die-(regret)...
[Something like, only inhaling the gas [regretfully] kills people... (incomplete).]
F3(5): Dakara chuushaki o hito no soba de pyutto yatte...
so syringe NOM people POSS side LOC ONOM do(te)
[So, with one squeeze of a syringe beside people... (incomplete).]
F2(6): Dakara moo hito-tare yo.
so only one drop VOC
(7): Pon-tte taraseba sono gasu ga yoosuruni nannte iuno, kuuchu ni kakusan-sarete..
ONOM(dripping) drop-COND that gas NOM in short how say in the air LOC scatter-PASS(te)
[So, it's only one drop. If dropped (with an onomatopoeic sound), that gas, in short, how can I say, is scattered in the air... (incomplete).]

In the conversation (4-3), which is informal, sentences (1) and (3) end in nouns without verb-endings. Sentences (5) and (7) end with te-forms of verbs, which suggest that the sentences are not yet completed.
As noted in chapter one, the te-form of a verb means "action and~" or "progressive action" (e.g. Makino and Tsutsui, 1986) and therefore connotes the "incompleteness" of an action or a "state of being"; accordingly, it is ungrammatical to end a sentence with a te-form according to Japanese prescriptive grammar. In short, sentences (1), (3), (5), and (7) in the discourse do not have clear modality at the sentence end. This avoidance of the sentence-ending makes "periodless" sentences that produce a "fading-out" effect. A Japanese sentence-ending modality expresses the speaker's psychological attitude toward the context of the speech; he can show, for example, to what degree he commits himself to his statement. Therefore, it seems quite logical to assume that individuals use avoidance of clearly formed sentence-endings as a strategy to express some degree of reservation toward their propositions (cf. also the case of te-linkage in chapter one, note 4). In this research, attention was paid only to completed sentences with sentence-ending modalities, although some incomplete endings are exceptionally considered to have modalities, as will be explained later in this chapter.

THE LANGUAGE

In this research, the target vernacular is "standard" Japanese (i.e., Tokyo dialect).5 Even though Japan is geographically a small country (smaller than the state of California), the Japanese language has hundreds of regional dialects. Some dialects are remarkably different from others phonologically, lexically, and morphologically, to the extent that communication problems can occur among people from different areas, while other dialects are not very distant from the standard vernacular (e.g. Sanada, 1983; Kindaichi, 1977; Sato et al., 1986). In the Tokyo area, basically standard Japanese is spoken, but regional dialects are also heard.6 As noted earlier, the data from American sites for this study were obtained from Japanese speakers who resided in Madison, WI and Austin, TX.
Informants' origins in Japan varied widely. The data collection in the summer of 1996 was carried out in the Tokyo area, but the informants' native dialects were also diverse. In both America and Japan, most of the informants used standard Japanese, but there were some informants who used their native dialects. If we assume that Japanese linguistic epistemology and culture are related, it is necessary to look into both standard and regional languages to see if their systems of evidentiality marking share the same concepts. However, in this research, standard forms have received the primary focus while the attention paid to regional differences is minimal. An effort is, however, made in this study to make some reference to non-standard utterances. Non-standard dialect speakers usually learn the standard dialect through institutions (e.g. schools) and other environmental factors such as media and human contacts. All Japanese speakers are assumed to understand standard Japanese, and a large proportion of native speakers of non-standard Japanese are perhaps practically "bidialectal". Since no significant difference was found in sentence modality between native and non-native standard dialect speakers in the data, I speculated that learners of the standard dialect perhaps learn the pragmatic rules of evidentiality coding as a part of the patterns of the Tokyo dialect, or that the major dialects share a common concept of evidentiality marking. If a unique pattern of evidentiality marking is seen systematically in certain non-standard dialect speakers' standard Japanese, it is possible to assume that the phenomenon is a "transfer" from their native dialect. Unfortunately, the dialect issue is far beyond the scope of this study; there are simply too many different dialects, and the boundaries between them tend to be fuzzy.7 For these reasons, possible differences in evidentiality coding among regional dialects were not seriously pursued in this research.
THE DATA I transcribed discourse utterances with attention to each word, complete or incomplete. Attention in transcription was not paid to phonological aspects such as variation of phonemes, nor most of the aspects of conversational pragmatics such as "timing of speech", "silent or hesitated period", "length of pronunciation", "overwrapping speech" 133 other than "intonational patterns". As to intonational pattern, careful attention was paid to the sentence-final intonation: rising, falling, flat, or other. These intonational distinctions are described in case the pattern affects the evidential meanings of the sentence final forms. For example, Japanese sentence final particle -ne, which often functions to indicate the speaker's awareness of shared status of his proposition with the hearer, is considered to have several different intonational tones (e.g. Oishi, 1985). It was assumed that a subtle tone difference may indicates significant difference of evidentiality meaning reflecting a speaker's cognition of the reality. THE DATA ANALYSIS Scope of analysis As the overall scope of this study is clarified, although there are a variety of evidentiality codings, in analysis, attention was paid primarily to the sentential-ending form which is the main linguistic issue of this research. Other types of modality expressions which involve evidentiality aspects (e.g. "deixis", "adverb", "incomplete sentence", and "hedges") were also analyzed in relation with the sentence-ending forms. For example, occasionally when a sentence- ending form does not involve modality of indirectness or low- assertiveness of the speaker, other types of modality are often substituted to produce a low-assertive mood in the sentence. An example is shown below: 134 (4-4) F3: nannka jyuu-nenn mae no karute ga mada somewhat 10 years before MODI medical chart NOM yet nai-tte sawai-deru. 
does not exist-QUOT fuss(te-form)-STAT

(F3: Somewhat [they] are clamoring saying the medical chart [of AIDS patient] of 10 years ago has not yet been found.)

In utterance (4-4), the speaker's topic belongs to the genre of public information, which is not in her information territory. She used the bare direct ending sawaideru (fussing), without incorporating addressee-conscious final particles, although the proposition was assumed to be known by her hearers. The sentence-ending modality of the utterance may be too direct from the standard viewpoint, but the word at the beginning, nannka (somewhat), functions to mitigate the speaker's commitment to the proposition. Other examples of lexical modality of indirection are sentences with adverbs such as tabun (probably) and osoraku (probably), and expressions such as toka-nanntoka (something like that). Syntactically, negative and passive forms are used for the same purpose. Prosodically, changing the tone provides a way to do so without making the sentence-ending form less assertive. However, as noted earlier, the sentence-ending form provides the most dominant modality of the sentence (cf. Chapter three).

Method of analysis

There are three factors involved in the analysis: (1) the frequency of occurrence and the type of sentence-ending evidential form, (2) the propositional content of the sentence, and (3) the speech situation in which the sentence was uttered. Quantitative analysis was carried out by creating a database containing a representation of each relevant speech ending in this study. These data were then analyzed by writing a series of computer programs to extract various patterns from them.
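The kind of tallying described in this method can be pictured with a short sketch. It is written in Python purely for illustration and is not the Perl code used in the study; the record layout follows the database fields described in this section, while every name (Utterance, tally, the field names) and every sample value is a hypothetical stand-in rather than actual data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    """One database record; field names are hypothetical."""
    informant: str   # informant identification, e.g. "F3"
    setting: str     # discourse type / group setting
    group: int       # sentence-ending form group (1-10)
    formal: bool     # formal (polite) vs. informal (plain) form
    tone: str        # "rising" or "falling" final intonation
    info_type: str   # information (proposition) type, "A" to "H"


def tally(utterances, *keys):
    """Count utterances by any combination of record fields."""
    return Counter(tuple(getattr(u, k) for k in keys) for u in utterances)


# Invented sample records, for illustration only.
sample = [
    Utterance("F3", "informal", 1, False, "falling", "C"),
    Utterance("F5", "informal", 2, True, "falling", "A"),
    Utterance("F5", "informal", 1, True, "falling", "C"),
]

# Frequency of each ending group per information type.
counts = tally(sample, "group", "info_type")
for key, n in sorted(counts.items()):
    print(key, n)
```

Cross-tabulating two or three fields at a time in this way corresponds to relating the ending form to the propositional content and the speech situation.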
The database is conceptually a collection of 7024 speech utterances, each of which has the following information associated with it:

(1) Informant identification (sex, age)
(2) Discourse type/group setting (formal conversation, informal, family, courtroom, school, public)
(3) Sentence-ending forms
    (a) Group identification for the forms (1-10)
    (b) Formal form (polite form)/informal form (plain form)
    (c) Ascending tone/descending tone
(4) Information (proposition) type (A-H) of each sentence

The computer programs used to analyze these data were written according to my specifications in PERL Version 5.003 on an IBM RS/6000 workstation running AIX version 4.2.1. PERL was chosen since it is a widely available language with powerful regular expression manipulation and associative arrays.

Sentence-ending evidential forms

The following [4-5] is a summarized list of the sentence-ending evidential forms, both informal and formal, that occurred in the data. The complete list of all forms (approx. 350 forms) is in Appendix B. For the purpose of systematic and realistic analysis of the entire data, the list was created based on the theoretical background attributed to each form as well as on early-stage analysis of the actual data. For convenience, prior to the detailed analysis, I classified the forms into ten different groups according to their syntactic and morphological forms. The largest distinction is made among "direct" (D), "indirect" (ID), and "question" (Q) forms. Direct-ending forms were further divided into five groups according to the types of suffixed sentence-final particles or other final lexical items as well as intonational differences. One group consists of direct forms with questioning tones (DQ, "direct-form question"), and some groups involve direct forms that show the speaker's sensitivity to the hearer's knowledge (SD, "semi-direct") through tag-question style, etc. Indirect forms are divided into two groups, hearsay and inferential evidentials.
Epistemic-auxiliary-ending forms (AUX) and "I think"-type ending forms (THINK) are indirect forms, but are grouped separately from the hearsay and inferential forms. In doing so, my intention was also to classify the final forms by their degree of estimated assertiveness.

[4-5] Japanese sentence-ending evidentials 8

Group 1: D Single-noun-ending; D Direct-form; D Direct-form with sentence-final particles such as -yo, -wa, -sa, -no, and -wake, -kara, -node and related forms.
    English equivalent: DIRECT
Group 2: D Direct-form with the sentence-ending particles -ne and -na with falling tone (-ne . and -na . ) and related forms.
    English equivalent: DIRECT (getting attention)
Group 3: SD Semi-direct-form with auxiliary "confirmation-daroo . " (falling tone) and negative suffix -janai . (falling tone) and related forms.
    English equivalent: TAG QUESTION.
Group 4: DQ Direct-Question-form with sentence-final particle -ne with rising tone (-ne .), and "confirmation -daroo ." and negative suffix -janai . with rising tone; DQ Quasi-question forms and related forms.
    English equivalent: TAG QUESTION., NEGATIVE QUESTION.
Group 5: SD Semi-direct form with the particle -ne# (with rising + falling tone) and related forms.
    English equivalent: TAG QUESTION (as we both know)
Group 6: Q Question forms with a question particle -ka or -no and related forms.
    English equivalent: QUESTION.
Group 7: ID Inference forms such as -mitai, -yoo, and -rashii and related forms.
    English equivalent: IT APPEARS, IT SEEMS
Group 8: ID Hearsay evidential forms such as -datte and -soo and related forms.
    English equivalent: I HEARD
Group 9: AUX Epistemic auxiliaries such as -kamoshirenai, -hazu, "conjecture -daroo" and related forms.
    English equivalent: MIGHT BE, MUST BE, PROBABLY
Group 10: ID "I think" forms such as -omou and -kangaeru.
    English equivalent: I THINK

In [4-5], most of the D (direct form), Q (question form) and ID (indirect form) endings have both informal and formal forms.
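For the tone-sensitive endings in [4-5], the group assignment depends jointly on the ending and on its final intonation. The mapping can be sketched as a small lookup table. This is an illustration in Python under stated assumptions, not part of the original analysis programs: the romanized keys, the tone labels ("falling", "rising", "rise-fall"), and the function name classify are hypothetical, and only the endings whose tone is specified in [4-5] are covered.

```python
# Form-type label of each group, as listed in [4-5].
GROUP_LABEL = {
    1: "D", 2: "D", 3: "SD", 4: "DQ", 5: "SD",
    6: "Q", 7: "ID", 8: "ID", 9: "AUX", 10: "ID",
}

# (ending, final tone) -> group, for the tone-sensitive endings only.
# "daroo" here means the confirmation use, not "conjecture -daroo"
# (which belongs to Group 9).
TONE_SENSITIVE = {
    ("ne", "falling"): 2,
    ("na", "falling"): 2,
    ("daroo", "falling"): 3,
    ("janai", "falling"): 3,
    ("ne", "rising"): 4,
    ("daroo", "rising"): 4,
    ("janai", "rising"): 4,
    ("ne", "rise-fall"): 5,
}


def classify(ending, tone):
    """Return (group, label), or None if the pair is not tone-sensitive."""
    group = TONE_SENSITIVE.get((ending, tone))
    if group is None:
        return None
    return group, GROUP_LABEL[group]


print(classify("ne", "falling"))    # rapport use of -ne
print(classify("ne", "rising"))     # confirmation use of -ne
print(classify("ne", "rise-fall"))  # sharing use of -ne
```

Keeping tone as part of the lookup key reflects the point that the same particle falls into different groups depending on intonation.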
To illustrate the formal/informal distinction, the direct affirmative non-past informal ending for to eat is (in context) taberu, and the formal ones are tabe-masu (addressee-honorific), itadaki-masu (humble), meshiagari-masu (hyper honorific), otabe ni narimasu (hyper honorific) and possibly others. No formal/informal distinction is made for sentence-final particles such as yo, sa, na, wa, no, and ne; therefore, when the ending is suffixed with a particle, the form of the verb, adjective, or copula before the particle is either formal or informal. Most ending forms have a version with the particle -n (or -no) inserted after the direct form of the Verb, Adjective, or Noun and before the ending copula -da (-desu for formal), constituting a V/Adj/N + n + da cluster. These forms are listed in the right-hand column of the list in Appendix B. The particle -no with this function is called the "nominalizing" particle, which is claimed to have an evidential function (Aoki, 1986, in chapter two). Kuno (1973) says that in declarative sentences this type of -no da (or -n da) cluster gives some "explanation" for the speaker's propositional context, and that in interrogative sentences -no desu ka? (with the question particle -ka) asks for the hearer's explanation of what the speaker has heard or observed, as example (4-6) shows.

(4-6) M8(1): naiyoo wa omoshiroi desu yo.
context CONT interesting COP(formal) PART(VOC)
(2): rabu ni kansuru koto desu kara.
love DAT relate COMP COP(formal) because
(3): uke o neratte-ru-n-desu yo.
popularity ACC aim(te-form)-STAT-n-COP(formal) (VOC)

M8 (1): Context [of my dissertation] is interesting (I am telling you).
(2): Because it concerns love.
(3): [because] I am hoping to be well received [by readers] (I am telling you).

The utterances in (4-6) are part of a discourse in which M8 was explaining the research topic of his dissertation. In (3), he said that he decided on the topic expecting people's curious attention.
This utterance gives an explanation for his previous utterance (1): the topic is interesting. The following discourse is an example of an -n da cluster in an interrogative sentence:

(4-7) M13(1): sore wa chotto ikura sooseiki no terebi da
that TOP little even initial-stage MODI TV COP
to ittemo amari nai deshoo.
QUOT-COND not many exist(NEG) AUX(CONF)
F22 (2): uun... maa naku mo nakatta desu
Well well exist(NEG) exist(NEG)(PAST) COP(FOR)
ne.
PART(RAPP)
M13(3): aru-n-desu ka .
exist-n-COP(formal) Q

M13(1): Even though it was one of the initial-stage TV programs, that did not happen often (did it?)
F22(2): Well, it is not that [things like that] did not happen.
M13(3): Did it happen (as you said)?

Here, M13 and F22 were discussing a "live" TV soap drama of some twenty-five years ago in which an unplanned replacement of main characters was carried out without informing the viewers. In (1), M13 assumed that such an occurrence must have been unusual. F22, who has more experience in the field, said in (2) that it was not unusual. M13 requested more explanation from F22 in sentence (3) by simply using the -n-desu-ka? cluster. From the perspective of evidentiality, n-da in M13's utterance (3) suggests that the utterance is based on evidence, i.e., F22's utterance (2). McGloin (1980) further developed this analysis of -n da and argued convincingly that a speaker uses the -n da cluster to subjectively explain, to persuade, to convince, or to give background information in a situation where certain information is known by both parties, or by either the speaker or the hearer. Kuno's and McGloin's analyses can be interpreted to mean that the -n da expression is concerned with (1) sharing information between two parties (from the speaker to the 'ignorant' hearer), (2) checking the truth value of the speaker's information with the resourceful hearer, or (3) confirming the shared status of the information between the two parties.
Therefore, -n da clearly functions as an evidential in various ways. McGloin also found that "in purely objective information giving/seeking situation, no desu cannot be used" (1980: 144), suggesting the subjective nature of the particle -no, which asserts that the speaker's proposition is supported by evidence. A group-by-group explanation of the sentence-ending evidential forms summarized in [4-5] is provided below:

(Group I sentence-final evidential forms)

The first group of sentence-ending forms (Group I) is assumed to comprise the most direct forms used in Japanese, and accordingly is considered appropriate for presenting any information to which the speaker attaches a high truth value. Theoretically, the first listed form, the noun ending, is not a complete sentence ending, so it should not be of major concern to this study. However, it was observed in casual conversation, family discourse in particular, that the simple noun ending was used too often to be ignored. I therefore listed incomplete endings with a noun as a kind of direct modality form. The direct ending is the plain form of the verb, adjective, or copula without any suffix. In conveying information that is truthful from the speaker's viewpoint, however, speakers who are sensitive to the presence of hearers may in many instances consider plain direct forms too "uninteractional" and add some kind of sentence-final particle or other modality expression to their proposition to create different types of direct mode. As briefly noted in chapter three, sentence-final particles are hearer-sensitive and, like -n da clusters, are not used in formal Japanese writing or in formal public speech that does not assume a specific audience (e.g., Saji, 1956). Each particle is said to connote some kind of conversational nuance from the speaker to the hearer.
It is sometimes very difficult to translate the meanings that final particles attach to the proposition, so they are often left untranslated in other languages. As noted in chapter three, it is said that the particles -yo and -sa function to "impart information which belongs to the speaker's sphere to an addressee" (McGloin, 1990), "forcing the speaker's view on to the hearer" (Tokieda, 1951) or "focusing on the informational aspect of the proposition" (Maynard, 1993). Kinsui (1992) said that by using the particle -yo a speaker "declares" his intention to input the information (i.e., his proposition) into his indirect memory, which is reserved for the hearer's assumed knowledge (p. 8). Examples of -yo usage are shown in sentences (1) and (3) in (4-6). -Sa is used in the same way as -yo, although it probably connotes masculinity more strongly than -yo. -Wa and -no have been characterized in two different ways: Ueno (1971) said that they have the same function as -sa and -yo, while McGloin (1990) considered that -wa and -no create rapport, or request sympathy from the hearer. It seems that -wa and -no are, as McGloin argued, slightly different from "declarative" -yo and -sa. In my analysis, they are not "declaring" but rather "extending" the speaker's rapport to the hearer. At the same time, however, it is also true that the -wa and -no particles convey less sense of rapport than -ne. For this research, I included the -wa and -no evidentials in Group (1), the category of highly direct evidentials. These Group (1) final particles are therefore generally speaker-oriented. The following are some examples of -wa and -no.

(4-8) F5(1) : nihon-tte ima nan-nin kurai eizu kanja ga
Japan-QUOT now how many people about AIDS patient NOM
iru ka shitte-masu .
exist COMP know(te-form) formal
F16(2) : seikakuni wa wakaranai wa .
correctly CONT know(NEG) PART(VOC)

F5 (1) : Do you know how many AIDS patients are here in Japan now?
F16(2): I do not know precisely.
The use of -wa by F16 in sentence (2) shows a common usage of -wa in imparting the speaker's own state of being. -Wa typically connotes femininity (in the standard dialect), as does -no. It is also difficult to translate the nuance of -wa and -no into English.

(4-9) F6 (1): gakkoo ga owatte, minna de atsumatte,
school NOM finish(te-form) everybody INS gather(te-form)
ja Sakae e ikoo ze -tte koto ni
then Sakae DIR go(VOL) PAR(VOC) QUOT COMP DAT
natte minna de jitensha de kuridasu no.
become(te-form) everybody INS bicycle INS crowd to
(2): de machi e itte, chika-gai ga aru no.
then downtown DIR go(te-form) underground mall NOM exist
(3): soko e haitte, shabekuru to iu.. .
there DIR enter(te-form) chat QUOT
(4): sorede ie e kaette syukudai o suru no.
then house DIR return(te-form) homework ACC do

F6 (1): After school, [we] all gather, and decide to go to Sakae, and everybody goes by bicycle (no).
(2) Then, go into the town, there is an underground mall (no).
(3) [We] go into there, and talk,
(4) Then, [we] go home and do homework (no).

In (4-9), the speaker explained what she habitually did in her high school days; naturally, therefore, her commitment to the proposition is very high. The -no ending is used in sentences (1), (2), and (4). I have included -kedo (and -ga) (meaning but) and -kara (and -node) (meaning because) as sentence-final forms, although they are not usually considered to be so. They are conjunctions, and if a sentence ends with one, the sentence is, grammatically speaking, incomplete. However, the original meanings of these conjunctives are often ignored, and they are used to end a sentence in a fading-out fashion without clear direct modality. Since utterances ending with one of these conjunctives do not usually entail the hearer's knowledge but simply muffle the directness of the utterance, I included them in the direct-ending group.
An example of -ga use is shown below:

(4-10) F24 (1): eeto chiryoo houhoo no minaoshi o nasatta
well, treatment method MODI reexamination OBJ did(HON)
ka dooka to iu koto ni tsuite ukagatte mitai
whether QUOT COM regarding ask(HON) try(DES)
to omoimasu ga.
COM think(formal) but
(2) jiko chuushahoo o hikaeru desu toka ne,
self injection method OBJ refrain COP(FOR) like RAPP
kurio e no kirikae, shinsenna toketsu kesshoo o
domestic medicine DIR MODI change fresh frozen serum ACC
katsuyoosuru toka desu ne ,
utilize like COP(FOR) PART(RAPP)
dooiufoona koto o gutaitekini nasai-mashita ka .
what kind thing OBJ practically did(HON) Q

F24 (1) Well, I would like to ask if you reexamined your treatment (ga).
(2) What sort of thing did you do actually in terms of reexamination of treatment of hemophiliacs, such as refraining from self-injection, use of domestic blood, utilization of fresh frozen serum?

The use of -ga in (1) above does not carry any particular meaning. It helps to give the impression that the sentence is less declarative. The same is true of sentence-final -kara (because) or -node (because), as shown below:

(4-11) M1 (1): yuushuuna jinzai dattara oyakusho de mo
excellent human resource COP(COND) government LOC also
kigyoo de mo onajiyooni kyosoosite hippattekuru-tte
company LOC also alike compete(te) recruit-QUOT
iu no ga soo iu koto ga atte shikaru-beki na-n-da.
COMP NOM such COMP NOM exist(te) should -n-COP
F5 (2) soo desu ne. .
so COP(FOR) PART(RAPP)
M1: (3) shikamo amerika no baai wa ne .
moreover America POSS case CONT PART(RAPP)
dentootekini yakunin ni nattara kyuuryo ga
historically civil servant DAT become(COND) salary NOM
sagaru-tte iu no ga aru kara.
decrease-QUOT COMP NOM exist because
(4): futuudato maa sukunatutomo ne, .
usually well at least PART(RAPP)
daitooryoo ga kawaru tabi ni ue no renchuu-tte
President NOM change turn TEMP top MODI people-QUOT
iu no wa kubi o sugekaerareru-tte.
COMP TOP neck OBJ replace(PASS)-QUOT.
M1(1): If [they are] excellent staff, the government and private companies should compete to recruit those people.
F5 (2): It is so.
M1(3): Moreover, (kara) in the case of America, traditionally, one's income decreases if he becomes a civil servant.
(4): Usually, well at least, each time a new president is selected, high class officials are said to be replaced.

The sentence ending with -kara in (3) does not denote its literal meaning, because: there is no phrase or sentence that could be meaningfully connected to sentence (3) by the conjunctive -kara. Thus, when talking about American politicians, a topic which is supposed to be other people's information, the speaker used -kara, thereby avoiding the bare direct ending of the verb -aru (exist) in (3). The ending form -wake (literally reason) functions in a way similar to -n da in extending an "explanation" from the speaker about his propositional background:

(4-11) F3(1): gakuhi ga zero.
tuition NOM zero
F5(2) : zero. ii wa ne#.
zero good PART(VOC) PART(SHARE)
F3(3): daigaku made zero yo.
university till zero PART(RAPP)
F5 (4): sore zenbu zeikin .
that all tax
F3 (5) : zeikin
tax
M22(6): sono kawari josei mo yamenakute sumu yooni
that instead female also quit(NEG)(te) settle in such a way
hatarakeru kankyo-tte tukutte -aru wake.
work(POT) environment-COMP make(te)-RES reason

F3(1): School tuition is free.
F5(2): Free? That is good.
F3(3): It is free to university (I am telling you).
F5(4): That is all [paid by] tax?
F3(5): Tax.
M22(6): Even though [Swedish people have to pay high tax], the environment is well-conditioned to allow females to continue working (that is the background of high tax).

Wake as used in M22's utterance (6) performs the function of explaining that the utterance gives the background information for what has just been said. The degree of evidentiality attached to the wake-ending seems high.
Combined forms of Group (1) evidential ending forms, such as wayo and wakesa, also belong to this group.

(Group 2)

Group (2) final forms typically involve the particle -ne. -Ne and -na are said to be used to "seek confirmation from the hearer" (McGloin, 1990) or "solicit confirmation" (Maynard, 1993), but at the same time -ne and -na function to create rapport, or request sympathy from the hearer (e.g., McGloin, 1990; Tokieda, 1951), or interpersonally to "solicit emotional support" (Maynard, 1993). Each of the particles -ne and -na is thus said to have two different functions: "requesting confirmation" and "requesting/sending rapport". However, how these two functions are linguistically distinguished has rarely been discussed. The prosodic features of sentence-final particles seem to have been rarely investigated other than by Tanaka (1973, 1977) and Oishi (1985). Oishi argued that intonational patterns determine the different functions of the particle -ne. He pointed out that -ne (and yone) can be uttered with four different tones:

(1) the pitch of the final syllable of the word preceding the final particle nee is lower than the pitch of its first vowel /ne/ and the pitch of this vowel is higher than the second /e/;
(2) the pitch of the final particle is higher than that of the final syllable of the preceding word in one syllable particle; or the pitch of the final syllable of the final particle is higher than that of the penultimate syllable;
(3) the pitch of the final particle is lower than that of the final syllable of the preceding word in one syllable particle; or the pitch of the final syllable of the final particle is lower than that of the penultimate syllable;
(4) no pitch differences between the two identical vowels (/e/) in nee.

Oishi argued that the discourse meaning of each ne is different. He referred to only ne and yone, but this observation must apply to other ne-related final particles (e.g.
wane) and the particle na, which is a slightly vulgar version of ne. (p. 60)

Four different pitch types were also confirmed in my data. Taking Oishi's distinction into consideration, I assumed three types of ne in my model, as described below: 9

(a) Ne.: Ne with a falling intonation, which does not necessarily ask for either confirmation or agreement from the hearer, is simply placed by the speaker between phrases or at the end of sentences to make utterances interactive, by requesting attention and rapport from the hearer or by mildly asserting the speaker's contention. So logically, and also empirically, a speaker can insert this ne after every word or phrase of his sentence.

(4-12) F12(1): nanka ne . aakansoo ni ita toki ni ne .
something like Arkansas LOC lived when TEMP
ano hito ne . gabanaa ka nanka datta desho.
that person Governor something like COP(PAST) AUX(CONF)
(2) : sono toki ni ne . sekuretarii datta to omou kedo ne .
that time TEMP secretary COP(Past) QUOT think but
maa, chotto bijin no ko ga ite ne .
well a little pretty girl MODI girl NOM exist(te)

F12 (1): something like (ne), when he was in Arkansas (ne), that person (ne) was the governor or something like that (wasn't he?)
(2): At that time (ne), I think that was his secretary (ne), well, there was a cute girl (ne).

Characteristically, this -ne seems to be related to information that belongs to the speaker's territory and is not known by the hearer. I call this -ne "rapport -ne". A small proportion of speakers habitually pronounce this -ne as a short rising sound. This use of rising -ne functions as if the speaker were asking the hearer "Are you listening?" while conveying information that is likely unknown to the hearer. This rising version of "rapport -ne" is easily distinguishable from the real "rising -ne" (the second type of -ne) because it obviously does not involve the speaker's concern about the hearer's knowledge.
I decided to group these two types of "attention-getting -ne" into the same category because the evidential function of both ne's is the same: to get the hearer's attention or sympathy toward the speaker's proposition. Falling -na has the same function as the falling rapport -ne.

(4-13) M1 (1): friitaa.
"freeter" (self-employed person usually working independently)
(2): friitaade ne.
freeter-(te form) PART(RAPP)
(3): de, kekkyoku syuushoku mo sezu ni ne .
then after all get a job even do(NEG)-adverb PART(RAPP)
jyuu-nen bakari asonda-n-da na .
10 years about had leisure-n-COP PART(VOC)
(4) : nanka kissaten no keiei ka nanka
somewhat coffee shop MODI management or something
yatteta-n da na.
did (te-form)STAT -n COP PART(VOC)

M1 (1): "Freeter."
(2): [He was] a "freeter"
(3): Then, after all, he did not get a solid job [as every university graduate does immediately after graduation] and had a leisure time for about 10 years (na).
(4): [He] did something like managing a coffee shop (na) (this refers to my previous utterance).

In (3) and (4), -na is used sentence-finally. The speaker was talking about a Japanese author's personal information. The use of -na in (3) and (4), as well as of ne. in (2) and (3), suggests that the speaker assumed the hearers did not know the information (so he was informing them of what he knew). Now, we turn to the second type of -ne.

(b) Ne .: Ne with a rising intonation is often used by a speaker to ask the hearer for confirmation of the truth value of his proposition. This -ne is therefore often used for a proposition that is assumed to be known by both parties. It often sounds like a question because the speaker's surface intention is to ask for the hearer's agreement. The major evidential function of this -ne is to confirm that both parties have the same information, either in one party's information territory or simply as knowledge. I call this ne "confirmation -ne".
In the following example, a school teacher was asked by a student to change what the student had written on the board; the teacher changed the writing and then tried to confirm her understanding of the student's meaning:

(4-14) F25: koo iu fuu ni kakikaeru to iu koto desu ne .
this way like rewrite QUOT COMP COP(formal) CONF

F25: It is said to rewrite this in the way like this (am I right?)

In a sense, this -ne . functions in a way similar to the question particle -ka. The difference between the two is that -ka is used for a question to which the speaker is supposed not to have an answer; the proposition is not in the speaker's territory or knowledge. The next -ne (the third one) involves the hearer's knowledge more deeply than the confirmation -ne.

(c) -Ne#: The third type of -ne is the one with an intonation that first rises and then falls, usually pronounced longer than the (a) or (b) type of -ne, or with a flat prolonged intonation without falling. This -ne is characteristically used to end a proposition that the speaker knows to fall into both parties' information territories. I call this -ne "sharing-ne". From the viewpoint of discourse management, this -ne functions to convey a sense of camaraderie, or in-group intimacy in sharing information, and functions evidentially to show that the truth value of the speaker's proposition is fully acknowledged by both parties. In the following example, (4-15), the speaker and the hearer were talking about the hearer's shadow-picture products, and since they were both observing these products at the time, they were actually sharing the same experience, which encourages the use of "sharing-ne#":

(4-15) F22: kore wa ari desu ka. Ari to kirigirisu no
this TOP ant COP(FOR) Q Ant and grass-hopper MODI
o-hanashi desu ka .
HON-story COP(FOR) Q
kore nannka mo zuibun komakai desu ne#
this also very fine COP PAR(SHAR)

F22: Is this an ant? Is this the story of ant and grass-hopper?
This one is also very finely-cut (as we both can see).

Although Kamio (1994) emphasized the importance of -ne as a pragmatic discourse marker, he discussed one general -ne, which is obligatory when used for information that belongs at least to the hearer's territory. Takubo and Kinsui's theory also considered -ne as one general concept, in that -ne confirms the sameness of existing information in the speaker's memory and the hearer's memory area, i.e., type (b) and (c) -ne in this study. However, considering the concept of evidentiality coding, these three types of -ne must be differentiated. There are individual differences in -ne pronunciation, and some people prefer one type of -ne over the others regardless of the propositional type. Generally, however, it seems that a high proportion of informants had these three types of -ne. Each type of -ne was often used independently, as if it were deictic and represented the sentences spoken before. Observe the following example:

(4-16) F5(1): de souru daigaku-tte arimasu -deshoo .
then Seoul Univ. QUOT exist(FOR)-AUX(CONJ)
maa kankoku no toodai, asoko ni hairu no wa
well Korea POSS Tokyo Univ. there DIR enter COMP TOP
kankoku de wa ichiban no eiyo rashikute
Korea LOC CONT primary MODI honor AUX(te)(it seems)
F18(2): un un rashii
yes yes seems
F5(3): ima, nihon wa soo de mo nai-deshoo .
now Japan CONT so COP NEG AUX(CONF)
sorehodo de mo mukasi hodoja-nai to omou-n-desu.
such degree COP old times degree(NEG) COMP think-n-COP(FOR)
nannka sugoi mitai. jisatusha mo ooi mitai.
somewhat extreme seem suicide also many seem
F18(4): nee#

F5(1): Then, there is a university called Seoul University, as you know. It is like Tokyo University of Korea, it seems very difficult to enter there,
F18(2): Yes, it looks like so,
F5(3): Isn't Japan as bad as before (regarding the entrance competition into the Tokyo University)? I think the situation is not so bad as old times.
It seems that (competition to enter the univ in Korea) is very hard. It looks like there are a lot of suicides.
F18(4): nee# (Yes I agree it does so.)

In this conversation, in F18(4), the speaker uttered only "sharing -ne", meaning that she shares the information presented in F5(3). "Sharing-ne" represents the Group 5 endings. Group (2) ending forms are mostly "rapport -ne" and its related forms.

(Group 3)

This is a group of semi-direct (SD) forms. Important forms in this group are the auxiliary "confirmation-daroo." (-deshoo. in polite form) with falling intonation, which is in effect almost equivalent to the English tag question, isn't it., and -janai. (or dewa nai) with a falling intonation, which is also functionally similar to the English tag question, isn't it.

(4-17) F1 (1): video wa itsu miru no .
video CONT when watch Q
F2 (2): watashi yoru nechau hito dakara,
I night sleep(regret) person because
video mitete mo nechau kara.
video watch(STAT) also sleep(regret) because
(3): Un, dakara, asa miru no.
Yeah, so morning watch PART(VOC)
(4): de doyoobi wa okeiko ga atta-ri suru kara
then Saturday CONT teaching NOM exist-(etc.) do because
kekkyoku asa hayaku okite, osooji toka-tte iroiro
after all morning early rise(te) cleaning etc-QUOT various
shinakya naranai desho. .
do-obligation AUX (CONF)

F1 (1): When do you see videos?
F2 (2): Because I sleep (early) at night, I fall asleep even when I am watching movie videos,
(3): Yes, so, I watch them early morning.
(4): Then, on Saturdays, I have students or something, therefore eventually, I wake up early in morning and have to do laundry and other things, (don't I .)

F2 in (4-17) talked about part of her life-style: she watches movies in the morning. Since the proposition is her own information, she did not need to ask for the hearer's agreement on it; a direct sentence ending for (4) would be perfectly acceptable.
However, how F2 spends her Saturdays, as a housewife who teaches flower arrangement on those days, is not beyond the hearers' imagination, given that F2 and her listeners are close friends. Moreover, doing laundry and cleaning in the morning (every day) is a widely shared daily schedule among Japanese wives. In this way, "confirmation-desho." is often used to express the speaker's information which may be known by the hearers. The negative suffix -janai seems to be used in the same way, as in (4-18):

(4-18) F7: (1) tonari ga juuniji kara sutereo ookiku kake-dashita
next door NOM 12a.m. from stereo loudly play-started
no ne..
PAR(VOC) PAR(RAPP)
(2) urusai toka omotte, jibun de iu no mo sankai me toka
noisy like think(te) myself INS say COM three times like
yonkai me toka onnaji koto o iu no iya -janai ..
four times like same thing ACC say COMP don't like- (NEG)
(3) dakara furonto ni denwashite ano urusai-n desu yo
so front DIR call(te) well noisy-n COP(FOR) VOC
nannte ittara "We'll send somebody up" toka itta
something like said(COND) like said
kara sutaffu ga kuru no ka na toka omottara
because staff NOM come COMP wonder like thought(COND)
ikinari don don toka itte omawarisan ga kicchatte.
suddenly bang bang like say(te) police officer NOM came(regret)
(4) majison poliisu.
Madison Police
(5) de watashi ga repootoshiteru janai .
then I NOM report(STAT) (NEG)
(6) tonari no heya no ruumu-meito o.
next door MODI room MODI room-mate ACC
(7) watashi-tte meen-janai.
I -QUOT mean-(NEG)

F7: (1) My next door neighbor started to listen to music loudly from twelve midnight (rapport -ne).
(2) I thought it was noisy or something, it was embarrassing to complain three, four times [to the neighbor] myself (isn't it.).
(3) So, I called the front desk [of the dormitory apartment] and said [the neighbor was] noisy, then [they] said "we'll send somebody up" or something, so I thought the staff might come, then suddenly, bang bang bang [at the door], then policemen came (te-incomplete).
(4) Madison Police.
(5, 6) Then, I am the person who reported on the roommate (aren't I .)
(7) Aren't I mean?

In explaining how she in effect reported on her own roommate to the police, the speaker used -janai. (isn't it.) to end sentences whose content the hearer can reasonably identify with: it is understandable that to complain repeatedly is embarrassing (sentence 2), and the hearer has already been informed that the speaker is the person who reported the case (sentence 5). -Janai. is the contracted form of de-wa-nai (S + copula + contrastive + negative). Although this form does not function to negate the proposition to which it is attached, its surface syntactic structure implies that S (i.e., the proposition) is understood information among the conversationalists. Group (3) ending forms are called "semi-direct forms" (SD) in this research.

(Group 4) The "rising -ne" belongs to the Group (4) sentence-ending forms, which are generally used for expressing the speaker's intention to request the hearer's agreement. "Rising -janai." (isn't it. or negative question) and "rising daroo." (isn't it.) give the impression that the speaker is asking the hearer a question. These forms are also semi-direct forms; however, since the forms of this group are direct with an obvious questioning intention on the part of the speaker, I call these forms "direct-question forms" (DQ forms). Therefore, the ending forms in this group are likely to be used as evidentials for propositions which are known by both parties. An example of this rising -janai. is seen in (7) of (4-18). It is different from the falling -janai in the same discourse. In (7), the speaker is really asking whether the hearer agrees with the proposition that the speaker is mean. The following discourse shows a case of rising deshoo. usage:

(4-19)
F27 (1) eizu happyoo shita hito sukoshi wa
AIDS announcement did person little CONT
enjosareta-n deshoo .
helped(PASSIVE)-n AUX (CONF)
(2) sorede jibun ga eizu-datte juukyu-sai no nantoka -iu...
then oneself NOM AIDS-QUOT 19 years old MODI somebody-QUOT F5 (3) Aa, sono hanashi yonda. Yeah, that story read(PAST) (4) otoko-no-ko deshoo . boy AUX(CONF) (5) ano hito nannka kawaisoo janai . that person somewhat pity NEG F27 (1) Those people who declared that they caught the virus (from the blood-forming medicine) have been helped at least a little, haven't they? (2) So 19-years old one said he has AIDS. . F5 (3) Oh yes, I have read that story. (4) That is a boy, isn't he? (5) That person is, somewhat, miserable, isn't he? I believe that the argument that ending forms with rising intonation (i.e., -ne., -deshoo. and -janai.) without question-particle (ka?) belong to this group is intuitively appealing. The speaker uses the rising tone to ask if his proposition is right in light of the hearer's knowledge but he does not use -ka because it is not a genuine question; the speaker also has the information. 161 Also sentence-medial or final use of rising intonation, which I call a "quasi-question" is included in this group. Lately, sentence- medial and final rising tone of phrases/words in declarative sentences are very popular among young speakers. A good example is (1-1) the discourse excerpt cited at the beginning of this dissertation: (1-1) F2 (1): A, soo. Well so (2) : ano hito ga ichiban nan-te iu no ., yoosuruni tsukutta . that person NOM most how-COMP Q in short made (3): sarin o sukutte yoosuruni jibun de maita-tte iu ka. Sarin OBJ make(te) in short oneself INS scattered-COMP or (4) : yoosuruni kagakusha . in short scientist (5) : hotondo ga daigaku no toki ni soo-iu bunnya o most NOM university MODI time TEMP so-QUOT field ACC senmon to shite yatteta hitotachi . da kara tabun major DAT make(te) did people therefore probably tabun-tte iu ka yoosuruni kenkyuu . probably-QUOT or else in short research F2(1): Well, it is so. (2): that person did, the most, what shall I say, in short, made (Sarin gas)? 
(3): He made Sarin, and, in short, shall I say he scattered it himself?
(4): In short, a scientist?
(5): Most of them studied that kind of field as their major in their university days, so probably, shall I say probably, in short, research?

F2 used a rising intonation at the end of phrases and sentences, which makes a declarative sentence sound like a question even without the explicit question marker -ka (i.e., sentence-final -ka in Japanese). But the speaker was not posing questions. This use of rising intonation at the end of, and also within, a non-question sentence is novel among speakers of Japanese.10 The phenomenon was very new to me in 1996, and I had opportunities to discuss this issue with my friends in Japan. It seems that a speaker uses a rising tone for his sentence, or for some words within the sentence, to express, on the surface, that he is not confident in his proposition or in his selection of lexical items. I understand that this "untraditional" rising tone produces an effect of modesty; with the rising tone, the speaker pretends to ask for his hearer's agreement with what he is saying. In this sense, the quasi-question sentences or phrases substitute for traditional sentence endings such as -janai. or -deshoo. 11 At the least, this new "fad" phenomenon indicates that intonation can be an evidential marker.

(Group 5) Group (5)'s main ending form is the "sharing -ne#", which is most likely used as an evidential for information fully shared among speakers, as noted earlier. Usually, a sense of camaraderie is emphasized in the use of ne. The forms in this group are semi-direct forms (SD).

(Group 6) Group (6) contains question endings which involve the question particles -ka (polite sentence) and -no (casual sentence). Some question forms with falling intonation are not pragmatically intended to be questions to the hearer.
It seems that the speaker uses these falling-tone question endings to pretend to be modest enough to ask for the hearer's judgment of the truth value of his proposition. Question sentences with a rising tone normally seek information which the hearer is assumed to have. Therefore, Group (6) ending forms are likely to be used for the hearer's information that is not known to the speaker.

(Group 7 and 8) So far the sentence-ending forms are all direct except questions. Groups (7) and (8) consist of indirect sentence-ending forms (ID). -Mitai (it looks like), -yoo (it appears to be) and -rashii (it seems) are the forms for inference (Group 7). (Da)tte (I heard), -soo (I heard), -to kiita (I heard), -to iwareta (I was told), -to iu hanashi (It is said), and others are all hearsay expressions (Group 8).

(Group 9) Group (9) represents sentence-ending forms using epistemic auxiliaries of necessity and possibility (cf. chapter two). Kamoshirenai (it might be), hazu (it must be), ni chigainai (it must be), and "conjecture daroo" (probably) are used to indicate the possibility that the proposition is true, in that the speaker makes a subjective judgment based on some kind of evidence. Like the evidentials of hearsay and inference, epistemic auxiliaries are instances of the combination of structural and lexical expressions of evidentiality, while the Group (1) - (6) ending forms are morphological expressions of evidentiality. These auxiliaries are therefore often followed by particles and other sentence-ending forms, either direct or indirect, which allows those suffixed forms to bear the final sentence modality. For this reason, only direct- and semi-direct-type endings of auxiliaries (e.g., hazu desu, hazu yone., and hazu deshoo.) are listed and investigated to see the speakers' use of these subjective items; auxiliary forms with indirect endings (e.g., hazu mitai) were included in the forms of indirect endings in Group (7).
(Group 10) Group (10) consists of "I think" expressions, including -to omou, -to kangaeru, -to rikaisuru, and others. As the presence of -to (quotation) before the expressions suggests, most of these expressions are usually used as matrix verbs in complex sentences. These forms are treated as indirect sentence endings, although the expressions show the speaker's subjective judgment in the same way as the Group (9) evidentials. These items were separated in order to see how directly or indirectly the informants handle information through these subjective indirect expressions.

The occurrence of the sentence-ending evidential forms of these ten groups was analyzed in relation to two factors: the type of speech situation (including the speakers' age and sex) and the propositional content of the speech. In this research, I argue that the hearer is important in two distinct respects: the hearer's knowledge about the speaker's proposition is crucial for the speaker's choice of evidentiality, and the hearer's social relationship to the speaker is also crucial for the speaker in using the evidentiality markings to show appropriate politeness. The hearer's knowledge of the speaker's proposition is considered as the distance of the proposition from the hearer and the speaker. Do they both know the proposition very well? Is it public information? Is it the speaker's personal matter that he can commit himself to? Is the speaker talking about the hearer's matter? And so forth. The speaker may employ evidentiality expressions of different degrees of certainty in each situation, considering the hearer's psychological distance from what he is presenting. Therefore, it is necessary to classify propositional context for the purpose of analysis.

Proposition types

At the first stage of the analysis, the occurrence of the forms was analyzed in relation to the types of propositions, i.e., to what degree the speaker commits himself to the proposition's truth value.
My grouping of the propositions of sentences is largely based on the concept of the information territories of the speaker and the hearer. I grouped all propositions into six basic groups:

[4-20] Proposition types for direct and indirect evidential forms

Propositions for direct evidentials
(A) information that is in the speaker's information territory, that the speaker assumes the hearer does not know
(B) information that is in the speaker's information territory, that the speaker assumes the hearer knows
(C) information that is in the speaker's information territory, that the speaker assumes also falls into the hearer's territory

Propositions for indirect evidentials
(D) information that is in the hearer's information territory, that the speaker does not know
(E) information that is in the hearer's information territory, that the speaker knows
(F) information out of both speaker's and hearer's territory
(G) public information

(G) type propositions were included in the category of (F) type information at the beginning of the research, but were later separated for experimental purposes. (A) to (F) are the six basic propositional types in this research. This stratification of proposition types is based on empirical and theoretical analysis of the data. In my 1993 study, I looked into discourse data and confirmed that Japanese informants had unconsciously conformed to the rules of information territory and used different sentence-ending forms, as suggested by Kamio (1987, 1990).
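The typology in [4-20] can be read as a small decision procedure. The following Python fragment is only a minimal sketch of that reading, not the coding procedure actually used in this research: the class name, the field names, and the order in which the conditions are checked are all assumptions made for illustration. In particular, checking "public" only when the information lies outside both territories is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposition:
    """Hypothetical coding of one proposition (field names are illustrative)."""
    in_speaker_territory: bool
    in_hearer_territory: bool
    hearer_assumed_to_know: bool = False  # the speaker's assumption about the hearer
    speaker_knows: bool = False           # only relevant for hearer-territory information
    public: bool = False                  # well-known community information


def proposition_type(p: Proposition) -> str:
    """Return the letter (A)-(G) of [4-20] for a coded proposition."""
    if p.in_speaker_territory and p.in_hearer_territory:
        return "C"
    if p.in_speaker_territory:
        return "B" if p.hearer_assumed_to_know else "A"
    if p.in_hearer_territory:
        return "E" if p.speaker_knows else "D"
    # Outside both territories: (G) was later separated from (F).
    return "G" if p.public else "F"


def expected_evidential_class(ptype: str) -> str:
    """[4-20] lists (A)-(C) under direct and (D)-(G) under indirect evidentials."""
    return "direct" if ptype in {"A", "B", "C"} else "indirect"
```

For instance, `proposition_type(Proposition(True, False))` yields (A), the speaker's own information that the hearer is assumed not to know, for which [4-20] anticipates a direct evidential.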
At that time, as noted earlier in chapters two and three, Kamio's early model had only four cases of interaction of information territories, as [4-21] shows:

[4-21] Kamio's original concept of four information territories for a speaker

Inside the speaker's territory, inside the hearer's territory:
TERRITORY A (information belongs to both speaker's and hearer's territories): direct+ne form
Inside the speaker's territory, outside the hearer's territory:
TERRITORY B (information belongs only to the speaker's territory): direct form
Outside the speaker's territory, inside the hearer's territory:
TERRITORY C (information belongs only to the hearer's territory): indirect+ne form
Outside the speaker's territory, outside the hearer's territory:
TERRITORY D (information is out of both speaker's and hearer's territories): indirect form

In Kamio's earlier model, each territory was assigned a single surface sentence-ending form, as shown in [4-21]. My 1993 and 1994 studies confirmed that Kamio's model basically reflects reality, but there were findings which did not agree with this theory. The major disagreements and additions were as follows:

[4-22]
(1) For territory (A) information, not only was the form "direct + ne" used as expected by Kamio, but deshoo (tag-question), -janai (negative tag-question), and other related forms were also used by the informants.
(2) For territory (B) information, which Kamio claimed is the only case in which the simple direct form is possible, male informants generally used simple direct forms as expected, while female informants used direct forms with sentence-final particles such as ne (information sharing), yo (informing), and n-desu (explaining). These are addressee-oriented particles; therefore, it was suggested that the female speakers may have greater consciousness of the presence of hearers.
(3) For territory (C) information, for which Kamio assumed indirect forms with ne are appropriate, questioning forms and janaino. (negative tag question + questioning) were used by the informants instead of "indirect + ne" forms.
It was also noted that this janai was different from the ones for territory (A) information; the use of janai for territory (C) information was observed with rising intonation. (4) For territory (D) information, for which indirect forms with ne were expected, informants used simple indirect forms and question forms rather than the expected indirect plus ne forms. (5) Analysis of family discourse showed that more direct forms were used among family members regardless of information territories. (6) Data from formal interview discourse suggested that, in formal situations, speakers unanimously did not use simple direct forms at all in talking about information that belongs to their own territory; ne related forms were preferred. 169 (7) Kamio assumed that English speakers have a different concept of information territory; he argued that in English there are only two information territories, the speaker's territory and others; that is to say, English speakers do not care about the information territory of the hearers. However, my data suggested that English speakers also have a concept of hearer's territory and shared status of information between the speaker and the hearer. For territory (A) information, native English speaking informants used indirect forms in more than 62% of utterances, and for territories A and (B) information for which Kamio expected only direct sentence forms would be used by English speakers, some kinds of indirect forms were used in more than 70% of the utterances analyzed. Therefore, basically, English and Japanese may have a similar concept of information territory. (8) However, English speakers treat "public information" as everybody's information and used direct mode. This was a significant difference between the two cultures. The results of these earlier studies suggest the possibility of different concepts of information territories between males and females, in-group members and out-group members. 
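The one-form-per-territory assumption of [4-21], and the way the findings in [4-22] depart from it, can be made concrete with a small lookup. This is a hedged sketch only: the string labels for the forms and the function names are hypothetical, chosen here for illustration, and the table encodes nothing beyond the four cells of [4-21].

```python
# Kamio's early model as laid out in [4-21]:
# (inside speaker's territory, inside hearer's territory) -> (territory, predicted form).
# The form labels are illustrative strings, not the author's notation.
KAMIO_EARLY_MODEL = {
    (True, True): ("A", "direct+ne"),
    (True, False): ("B", "direct"),
    (False, True): ("C", "indirect+ne"),
    (False, False): ("D", "indirect"),
}

PREDICTED_FORM = {territory: form for territory, form in KAMIO_EARLY_MODEL.values()}


def predicted(inside_speaker: bool, inside_hearer: bool) -> tuple:
    """Return (territory label, single predicted sentence-ending form)."""
    return KAMIO_EARLY_MODEL[(inside_speaker, inside_hearer)]


def unpredicted_forms(territory: str, observed_forms: list) -> list:
    """List the observed forms that the early model does not predict for a territory."""
    expected = PREDICTED_FORM[territory]
    return sorted({form for form in observed_forms if form != expected})
```

Applied to finding (1) of [4-22], `unpredicted_forms("A", ["direct+ne", "deshoo", "janai"])` returns the two forms that the early model does not anticipate for territory (A). A single predicted form per cell is exactly what makes such deviations easy to detect.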
The studies also showed that the relationship of the propositional content and sentence- ending forms is not as simplistic as Kamio expected, suggesting that more finely sectioned information territories may exist in the Japanese speaker's mind. In these pilot studies, I analyzed data based on Kamio's categorization of information territory (four territories), and I now think the method that I used could be misleading; in doing analysis, the 170 possibility of the existence of other territories, or other types of interactions between the speaker's and the hearer's knowledge could have been ignored. In fact, as introduced in chapter three, in his 1994 revisional paper, Kamio proposed eight cases in which the speaker's and the hearer's information territories are differently interrelated. He added two more surface sentence-ending forms which represent a new concept of speaker/hearer territory interaction with daroo forms. The usage of daroo is actually found in my 1993 study, but I plainly concluded they are an extension of direct forms since the study was centered on Kamio's framework and I did not clearly see the implication of the use of daroo (tag question/negative question) by my informants. Based on this retrospective thought, for this dissertation, I desired not to limit my analysis within existing frameworks laid out by either Kamio or other evidentiality studies, but at the same time, it is hardly practical to analyze the relationship between the sentence forms and the evidential context of the speaker's proposition without some framework which provides a way to "sort out" propositions into different categories. Thus this time, I first went through one part of the data, and examined the relevance of Kamio's newer version framework (1994) to the data. I have gained some results through this process, and constructed my original model, and examined more data which resulted in more modifications of the model. 
I repeated this process a few times, and finally reached my final model. I believe that this method worked better than an approach in which the framework of an evidentiality system is first decided on and the forms of evidentials are then sorted from the data. Therefore, an attempt was made to examine the data without the restriction of existing theoretically hypothesized frameworks. In this sense, again, the method of analysis and the data analysis itself are interwoven at the first stage of this research. The more detailed process that led to the above categories of propositions [4-20] is explained in the next chapter.

An analytical problem

The crucial analytical problem, however, is how to judge the propositional type of an utterance correctly. This is not a problem of interpreting the speaker's "meanings", but a problem of judging how much the speaker should assume the proposition is known to or shared by his interlocutors. The speaker's proposition (or information) types are categorized according to the assumed status of the informational content of the proposition in the speaker's, the hearer's, or both parties' information territory or knowledge. In order to determine precisely how a given proposition is identified among conversationalists, it is necessary to know the nature of the proposition and how much each participant is supposed to know about it. This is not very difficult if one is in the discussion and able to observe the reaction of the hearer to an utterance and the subsequent reaction of the speaker to the hearer's reaction. However, since I cannot represent every informant's memory, the judgment is sometimes difficult.
Oishi (1985), who investigated Japanese final particles based on the theory of "linguistic particularity" (Pike, 1982; Becker, 1979), argued that an analyst's memory is unreliable: In understanding what was meant by a participant's utterance, an analyst relies on nothing but his own unique set of remembered prior texts without having direct access to the participant's set. In investigating how this utterance was interpreted by other participants in the conversation, the analyst again has to use his own set of remembered prior texts, which of course is different from that of the participants. As has been noted, one of the difficulties in the study of conversation lies in the fact that participants' assumptions are not immediately accessible to an analyst. These assumptions seem to be formed and stored in people's memory through their language uses in the past. We will see in our data that even between fairly new acquaintances, in the course of conversation, each participant's unique set of remembered prior texts is adjusted to the other's set, and common assumptions are formed through negotiations. In other words, it is a shared language activity that eventually forms such an assumption. In the relationship between an analyst and the participants, however, these processes of forming common assumptions are not logically available because an analyst typically does not share the conversation with the participants, and therefore lacks the shared memory of language uses with them. (1985: 19-20) Due to the memory barrier, Oishi said, correctly I think, that the actuality of conversation (i.e., text) is "distant" to the analyst and even to the participants. To minimize the effect of memory barrier, an "appropriation of text"12 was suggested by Oishi following Recouer (1981) and Becker (1977); however, the suggestion is not practical for this particular study. 
Since I desired to find general tendencies within my informants' use of evidentiality expressions, I looked into fairly large discourse data provided by about 60 informants (besides students), about 20 of whom are from public discourse. Therefore, it was difficult to go back to each informant to discuss the data, although review discussions were held with several of the informants concerning the type of proposition and the particular forms of sentence ending. Some discussions were useful while others were not. However, since I was a participant, I shared with the other participants the common assumptions formed in our temporal memories for many of the discourses. In this sense, I was less helpless than a simple observer-analyst. To make the analysis consistent, after analyzing a few discourse excerpts, I formulated some rules of analysis which I felt necessary in order to minimize my subjective interpretation of the speaker's proposition types. Although the possibility of subjective analysis is unavoidable, an effort was made to mitigate its influence.

Rules of analysis

Sometimes, it was difficult to properly categorize the nature of a speaker's proposition within the milieu of the seven different information types of [4-20]. For example, public information (i.e., type G) can often be information out of the speaker's territory (i.e., type F) as well as mutually known information if it is experienced in some way by both parties (i.e., type C). In order to make the analysis consistent, I formulated the following rules:

Rule (1): If the type of a given proposition is ambiguous, for example, ambiguous between (B) and (C), the utterance will be ignored in the analysis.
Rule (2): Hedges, conventional greetings, and conventional set phrases which do not represent their literal meanings will be excluded from the analysis.
Rule (3): Incomplete sentences which do not include sentence final modality, and sentences with deontic modality (i.e., modality concerning permission, prohibition, and obligation) will be excluded from the analysis. Rule (4): Information which is out of both parties' information territory and is well-known to most of the community members including the discourse participants and which is known to be known will be categorized as "public information" (G), while information that does not fall into either party's information territory and which is known by some or all participants will be treated as (F) type information. Regarding Rule (4), the informants showed that they distinguish between these two types of public information: (G) and (F). Often a speaker tried to confirm his hearer's knowledge about the public knowledge that he is presenting in order to decide on the mode of the proposition. The next discourse sample is an example: (4-23) F12 (1): a, igirisu, london ni sundeta toki ni Well England london LOC lived time TEMP kanojo ga koten o yattete, she NOM exhibition OBJ did(STAT) sono hanashi, shitteru desho? 175 that story know AUX(CONF) F5 (2) : shiranai don't know F12 (3): aa, honto? Ja, koten o yatteta -n datte. Well, really Then exhibition did(STAT)-n hearsay F12 (1): Well, in England, when they lived in London, she was holding an exhibition of her own, you know the story, don't you? F5 (2): No, I don't know. F12 (3): Well, then, it is said that she was holding an exhibition. In the above conversation in which speaker F12 was talking about Yoko Ono, a famous public figure, she assumed that the hearer knew the famous episode of the first meeting of Yoko Ono and John Lennon, so she presented the proposition in a direct mode in (1) suggesting she was treating the proposition as public truth (i.e., a G- type proposition). 
But after checking the hearer's knowledge by (2), F12 realized F5 does not know the proposition, then she switched her mode into the hearsay mode (i.e., F-type) in (3). However, not every speaker is this sensitive to the hearer's knowledge about public issues. In that case, the speaker possibly uses only direct mode to present public information which possibly gives the hearer the impression that the speaker is treating the proposition that is out of his territory as if it is in his territory. The next discourse is an example of this type of interaction: (4-24) 176 F2 (1): kawaisooda yo ne. miserable PART(VOC) PART(RAPP) (2): sorede kodomo ga futari mo dekichatte. then children NOM two as many as born(regret) (3): sorede rikon o shinai yooni san-nin me, then divorce ACC do(NEG) in such a way third one tsukuritai-tte itta kedo Chaaruzu, moo iranai. have(DES)-COMP said but Charles any more desire(NEG) (4): sokode moo hitori kodomo o tsukutte-oke-ba then more one child ACC have(te)-(RES)-(COND) warui kekka ni naranai-n-janaika-tte bad result DAT become(NEG)-n-(NEG)-Q-(COMP) iunde itta-n-dakedo, Chaaruzu ga kobanda no yo. so said-n-but Charles NOM rejected PART(VOC) (VOC) Others (5): sugoooi, yoku shitteru nee# great well know PART(SHARE) F2 (1): [Diana is] so miserable, isn't she? (2): Then they had two children. (3): So in order to prevent divorce, [Diana] said [to Charles] she wanted a third one, but Charles[said] he did not want anymore. (4): [Diana] said so because [she thought] they can avoid bad ending if they had the third child, but Charles rejected the idea (I tell you). Others (5): Wow... you know very well, don't you? In this conversation, speaker F2 was talking about the collapse of Princess Diana and Prince Charles's relationship, and since she used the direct mode (as underlined), the hearers (four of them) unanimously reacted to pretend they were impressed by speaker F2's knowledge. 
177 However, as the proposition is someone else's very private matter which can hardly be in speaker F2's information territory, others' reaction can be understood as critical. A proposition of this type is usually treated as a (F) type proposition and spoken with hearsay mode. Rule (5): If a given proposition that is public happened to fall in the speaker's or the hearer's, or both parties' information territory, personal territory will be considered to have the primary status. Rule (6): Common sense knowledge which almost everybody agrees to will be considered to be known by "experience" so it falls into proposition type (C). Following the rules above, all applicable propositions were sorted into (A), (B), (C), (D), (E), and (F) as proposed, and two other additional types (G) (public information) and (H) self-talk for experimental purposes (see the next chapter), and within each propositional category, the occurrence of sentence-ending evidential forms was monitored. The process for creating the database for quantitative analysis is illustrated in the following chart, [4-25]. 
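As a reading aid for chart [4-25], one record of such a database might be pictured as follows. This is a hedged sketch under stated assumptions rather than a description of the database actually built: the class, field, and function names are hypothetical, the three sample records are invented for illustration and are not data from this study, and the tally by proposition type and form group is only one plausible way of "monitoring" occurrences.

```python
from collections import Counter
from dataclasses import dataclass

# Discourse types 1) to 6) as listed in chart [4-25].
DISCOURSE_TYPES = {
    1: "discussion with high formality",
    2: "court interaction (prosecutor/defendant)",
    3: "public talk",
    4: "conversation with low formality with friends",
    5: "conversation with low formality with family members",
    6: "teacher-student interaction at school",
}
PROPOSITION_TYPES = set("ABCDEFGH")


@dataclass(frozen=True)
class EndingRecord:
    """One sentence-ending form with clear epistemic modality (hypothetical layout)."""
    informant: str          # code name such as "F1" or "M2"
    ending_form: str        # surface form of the sentence ending
    polite: bool            # plain/polite distinction
    group: int              # Group (1) to Group (10)
    discourse_type: int     # key of DISCOURSE_TYPES
    proposition_type: str   # (A) to (H)

    def __post_init__(self):
        if not 1 <= self.group <= 10:
            raise ValueError(f"unknown form group: {self.group}")
        if self.discourse_type not in DISCOURSE_TYPES:
            raise ValueError(f"unknown discourse type: {self.discourse_type}")
        if self.proposition_type not in PROPOSITION_TYPES:
            raise ValueError(f"unknown proposition type: {self.proposition_type}")


def tally(records):
    """Count occurrences of each form group within each proposition type."""
    return Counter((r.proposition_type, r.group) for r in records)


# Invented sample records, for illustration only.
SAMPLE = [
    EndingRecord("F1", "deshoo.", True, 3, 4, "B"),
    EndingRecord("F1", "rashii", False, 7, 4, "F"),
    EndingRecord("M2", "deshoo.", True, 3, 1, "B"),
]
```

With the invented sample, `tally(SAMPLE)[("B", 3)]` counts two Group (3) endings on (B) type propositions. Validating the codes when a record is created keeps mistyped group or proposition labels out of the counts.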
[4-25] Database for quantitative analysis

(1) Data collection (recording)
(2) Transcription
(3) Data input
(3-1) Informant data (SITUATIONAL CONTEXT)
    Code name (e.g., F1, M2) --> DATABASE
    Age --> DATABASE
    Gender --> DATABASE
(3-2) For each sentence-ending form with clear epistemic modality:
    (a) Informant's code --> DATABASE
    (b) Evidential form information:
        form of sentence ending --> DATABASE
        plain/polite distinction --> DATABASE
        group type of the form: Group (1) - Group (10) --> DATABASE
    (c) Discourse type (SITUATIONAL CONTEXT) --> DATABASE
        1) discussion with high formality
        2) court interaction (prosecutor/defendant)
        3) public talk
        4) conversation with low formality with friends
        5) conversation with low formality with family members
        6) teacher-student interaction at school (teacher/student)
    (d) Proposition type (PROPOSITIONAL CONTEXT): (A) ~ (H) --> DATABASE

CHAPTER 4: NOTES

1 Although I did not ask for information regarding social class, I assume the informants would claim that they are middle-class city-dwellers, since most Japanese people claim to be middle-class. All of the informants happened to be office workers (present or retired) or housewives. But this may not be applicable to all of the student informants in the schools I visited.

2 A brief account of each case is given below, which may help readers understand the transcribed speeches used in this dissertation.

Yakugai-AIDS case (case of medical products tainted with AIDS virus): In 1996, it was revealed that twelve years earlier, the Japanese Ministry of Health (MOH) had delayed the termination of the use of possibly AIDS-tainted blood products (ketsueki-seizai) imported from the U.S.A. for hemophiliac patients. This happened before the Japanese people became familiar with the disease.
Teikyo University found that more than twenty of their hemophiliac patients were HIV positive yet the university continued to use the blood products with the excuse that they were not sure if the patients were really infected by AIDS virus. MOH and its affiliated AIDS research committee led by a Teikyo University doctor were suspected of trying to delay the recognition of the first AIDS patient in Japan. It was suspected that this delay was due to the relationship between the ministry (MOH) and the manufacturer of the blood product, Midori-juji (Green Cross), a pharmaceutical company, run by officials retired from MOH. For more than ten years, the existence of hundreds of AIDS patients who had became infected by this blood product was not well known by the public. Finally in 1996, one young man who is a victim of the case requested public attention, and the newly assigned minister of MOH, who carried out the investigation, disclosed details of the misconduct to the public. This case 180 revealed two problems with Japanese society: problematic cohesion of government and industry which works contrary to the benefit of the public, and the secretive nature of Japanese governmental activities. Aum-shinrikyo case (case of Aum cult): A cult, led by Asahara Shokoo, who claimed to be "God", attempted to seize Japanese Governmental functions. Interestingly, Asahara had a lot of intellectual and successful followers who supported him financially and technically. They invented weapons (conventional and biological) and other materials to occupy the country physically and killed those who tried to escape from the cult or who were about to find out what the cult was attempting. They surfaced for the first time when seizure of the governmental body at the Kasumigaseki area was attempted by strewing Sarin poison gas in the area. Several core members were involved, and even after Asahara himself was finally arrested, some of them were still at large. 
Since further attempts to physically seize governmental control were feared, the police carried out one of the most extensive searches the country had ever seen.

3 For example, the following chart demonstrates the relationship of the group membership of the parties involved with the selection of the verb "to be":

[4-26] Different "to be" verbs depending on listener and referent

listener     referent     verb used by the speaker
in-group     in-group     iru
out-group    in-group     oru
in-group     out-group    irassharu
out-group    out-group    irassharu

4 The system of Japanese honorifics has been considered to have two axes: the speaker-addressee axis ("performative" honorifics) and the speaker-referent axis ("propositional" honorifics) (e.g., Harada, 1976; Shibatani, 1990). "Addressee-oriented" honorifics are said to be widespread throughout the world. The use of French vous and German Sie is an example (Shibatani, 1990:375). Addressee-oriented honorifics do not require the presence of someone "socially superior to the speaker" in the propositional content of the sentence (Harada, 1976:502). Japanese polite sentence-ending forms (i.e., desu/masu) fall into the category of performative honorifics. For example, the following three sentences in (4-27) have the same referential meaning, "this is a book", but (a) is used with familiar or equal-status addressees in casual speech settings, while (b) is used with someone who is socially distant or higher. (b) is also used among equals or with lower-status addressees in formal settings with bystanders. (c) is used with an addressee who is significantly superior to the speaker, or with anybody in a very formal environment.

(4-27)
(a) Kore wa hon da.
this TOP book COP
(b) Kore wa hon desu.
this TOP book COP(FOR)
(c) kore wa hon degozaimasu.
this TOP book COP(hyperpolite)

"Referent" honorifics (or propositional honorifics) place the target of honorific use in the subject position of the sentence ("subject honorifics") or the object position of the sentence ("object honorifics"). Each "performative (addressee)" and "propositional (referent)" honorific usage has three different levels of formality: "plain", "polite", and "hyper-polite", as shown in the above sentences (a), (b), and (c). The axis of performative honorifics and the axis of propositional honorifics are independent of each other except when the subject or the object of a sentence coincides with the addressee or the speaker. Therefore, theoretically six different formality levels are possible. The following sentences (d) to (f') have the same referential meaning, the teacher laughed. Among them, (d) is a plain sentence without either propositional or performative honorifics. Sentences (e), (e'), (f) and (f') are examples of propositional (i.e., referent) honorifics in that the target of the honorific is sensee (teacher). The combination of the nominalized verbal form warai ni (to laugh) with the honorific prefix o- and the adverbial complement of the verb naru (become) indicates a form of referent honorific. In terms of propositional (referent) honorifics, sentences (d) and (d') are at the plain level, (e) and (e') are at the polite level, and (f) and (f') are at the hyper-polite level. In terms of performative (addressee) honorifics, (d), (e), and (f) are in plain form while (d'), (e'), and (f') are in polite form:

(4-28)
(d) sensee ga warat-ta.
teacher NOM laugh-(PAST)
--- plain
(d') sensee ga warai-mashita.
teacher NOM laugh-(FOR)(PAST)
--- polite (addressee honorifics), plain (referent honorifics)
(e) sensee ga o-warai ni nat-ta.
teacher NOM HON-laugh HON-(PAST)
--- plain (addressee honorifics), polite (referent honorifics)
(e') sensee ga o-warai ni nari-mashita.
teacher NOM HON-laugh HON-(FOR)(PAST)
--- polite (addressee honorifics), polite (referent honorifics)
(f) sensee ga o-warai ni narare-ta.
teacher NOM HON-laugh HONhyperpolite-(PAST)
--- plain (addressee honorifics), hyper-polite (referent honorifics)
(f') sensee ga o-warai ni narare-mashita.
teacher NOM HON-laugh HONhyperpolite-(FOR)(PAST)
--- polite (addressee honorifics), hyper-polite (referent honorifics)
[(d) and (e) are from Shibatani, 1990: 376]

Performative honorifics are shown in addressee-oriented sentence-ending forms, so they are directly related to the issue of this dissertation. In the use of performative honorifics, the plain form level (da, -ta, etc.) is perfectly acceptable for communication among people who share a close relationship, such as family, friends, and colleagues of similar age, without any implied disrespect. The plain form may also be used by a speaker in a superior position in informal situations to inferior-status addressees with no connotation of rudeness. The form is not suitable for any kind of formal setting such as meetings or speeches. Polite forms of performative honorifics (desu, -masu, etc.) are used among strangers and distant acquaintances to indicate social distance, and are also used by lower-status speakers to higher-status hearers in the same group (family, company, school, etc.) to show casual respect arising from status differences. Polite forms are as commonly used as the plain forms. The use of hyper-polite honorifics is limited to formal speech settings. This form of honorifics uses a different lexicon (e.g. to eat is meshiagaru in hyper-polite form vs. taberu in plain form), or is indicated by an honorific suffix or prefix. There are usually three different types of hyper-polite meanings: humble, exalted, and neutral.

5 Considering the long history of Japan, the Japanese language has been standardized only fairly recently. Standardization started in 1869 at the time of the Meiji-restoration.
Japanese people were historically "confined" to their birth prefecture, which was governed by a Daimyo (lit. big samurai), without the freedom to leave that prefecture. This policy was maintained for a long time in order to keep farmers "tied" to the land and so secure the tax income of each Daimyo. Therefore, there was no communication among the sixty-odd local prefectures. This restriction enhanced the development of local dialects. It is reported that during the middle of the Edo-era (i.e. seventeenth century) people were unable to communicate outside of their own prefecture. In addition to local dialects, "class" dialects developed; people in different social classes (e.g. monks, soldiers, general public, women) spoke different "languages". Further, each class used different written and spoken languages. Overall, before language standardization, there were diverse versions of the Japanese language. Then, after the political unification of all prefectures was achieved to establish the nation of Japan as a whole, it was realized that language standardization was urgently needed for "communication convenience" and also for "national unity". This necessity was heightened by the contingency of wars. Language planning started with the collection of data from local dialects in order to select one standard dialect. The national committee in charge decided to select the Tokyo dialect, and prescribed grammar details including phonological expressions. Written and spoken languages were unified in the standard language. The standard Japanese was successfully implemented through school education. The rapid development of mass-communication such as TV and radio also helped the implementation to a great extent. Mass-communication has also contributed to shaping the standard language into its current form. (e.g.
Kamei, et al., 1965a, b; Matsumura, 1986; Mashita, 1953; Sanada, 1983; Sato, 1982)

6 Sanada's quantitative research (1983) in every prefecture in Japan on the range of standardized forms of selected words indicated that Tokyo dwellers scored 61.1% on average. Although it is not as high as a non-dialectologist may expect, the score was the highest among forty-eight prefectures. The Kanto-area prefectures (surrounding Tokyo) were all ranked high: Saitama, 60.8%, Tochigi, 60.7%, Kanagawa, 59.4%, Gunma, 57.7%. Although Hokkaido, the northernmost island, is ranked next (53.8%), generally, the farther away from Tokyo a prefecture is located, the lower its score was. The southern islands, Okinawa (3.3%) and also the prefectures in Kyushu island (25-31%), scored low, as did the northern Honshu prefectures (21-27%).

7 Dialectal differences entail a variety of linguistic features; therefore, it is difficult to state how many regional dialects are spoken in Japan. Dialect maps are drawn to show regional differences in each single feature: phonemes, accent, tone, lexicon, semantic categories, a number of grammar aspects (e.g. conjugated forms of verbs and adjectives, nominal-adjectives, noun-compounds, particles, honorifics), and others. It has been generally understood that dialectal divisions based on different linguistic features have different dialect boundaries. However, Kindaichi (1977) described general divisions among dialects that reflect phonological, grammatical, and accentual differences among dialects. In Kindaichi's general dialectal map, there are three principal dialect groups: Nairin-dialect, Churin-dialect, and Gairin-dialect, and the groups are further divided into twenty-five subdivisions in total.

[A] Nairin-dialect ------ (5 sub dialects)
1. Standard Ko-type dialect
2. Tosa dialect
3. Western Kagawa prefecture dialect
4. Eastern Kagawa prefecture dialect
5.
Southern Noto dialect

[B] Churin-dialect ------ (10 sub dialects)
(a) Standard Otsu-type dialect
1. Eastern Japan Churin dialect (Tokyo, Kanagawa, etc.)
2. Western Japan Churin dialect
(i) Noobi dialect
(ii) Totsugawa dialect
(iii) Chugoku dialect
(iv) Shikoku Inan area dialect
(v) Northeast Kyushu dialect
(b) Quasi-Ko-type dialect
1. Hokuriku dialect
2. Sekiho, Nagahama dialect
3. Kumanonada dialect
4. Shikoku Uwa area dialect

[C] Gairin-dialect ------ (10 sub dialects)
(a) Eastern Japan Gairin dialect
1. North Oh-u, Hokkaido dialect
2. South Oh-u, Northern Kanto dialect
(b) Hachijoo-jima dialect
(c) Ooigawa, Yamanashi-Narada dialect
(d) Northwest Noto dialect
(e) Izumo, Oki dialect
(f) Kyushu dialect
1. Chikuzen, Iki, Tsushima dialect
2. Miyazaki dialect
3. Northwest Kyushu dialect
4. Satsuma, Goshima dialect

8 Sentence-ending forms are described in standard Japanese. The data contains limited numbers of dialectal forms (most of them are from the Kansai area); they are "assimilated" in description into the standard forms in quantitative analysis. The following are dialectal forms included in the list:

(local dialect) ---> (standard forms) (meanings of expression)
(example) (example) (translation of example)

V ta form n-ka na ---> Vta form no ka na  I wonder (self-talk)
(e.g. atta-n-ka na) (e.g. atta no ka na) (I wonder there was..)
-n kedo ---> -nai kedo  direct negative
(e.g. shira-n kedo) (e.g. shira-nai kedo) (I do not know)
-toru ---> -Vte form + iru  stative
(e.g. ittoru) (e.g. itteiru) (they are saying)
-totta ---> Vte form ita  somebody said...
(e.g. ittota) (e.g. itteta) (Somebody said so)
ya ---> da, yo  direct vocative
(e.g. soo ya kedo) (e.g. soo da kedo) (It is so, I am telling you)
(e.g. kita-n ya) (e.g. kita no yo) (Someone came, I am telling you.)
-yate ---> -datte  hearsay
-to chigau? ---> -janai?
tag-Q, negative question
-yaro. ---> -deshoo.  tag-Q
-henya-n-ka ---> -hen janai?  Isn't it strange?
-nen ---> -n da  Explanation
(e.g. kireru nen) (e.g. kireru n da) (This cuts, you understand)
-yate ---> -datte  hearsay
(e.g. akan yate) (e.g. dame datte) (Someone said "No")

9 Oishi characterized ne with rising tone as indicating that information belongs to the speaker's territory. I suspect, however, that this rising -ne in the data described by Oishi is the rising version of "rapport ne" in my analysis, which simply sends an "I am talking, are you listening?" message to the hearer. Oishi found this ne (in his data) in a single speaker; therefore, the high pitch of rapport -ne may be this individual's personal trait (actually there are some people who habitually do this). In my model, "rising -ne" (as well as rising -yone) involves both parties' knowledge.

10 There are traditional ways of raising declarative sentence-endings to express questions. Actually this usage is very common, especially in casual speech. However, these "traditional" rising endings in declarative sentences and "quasi-questions" in declarative sentences are different in tone. In quasi-questions, often the very last vowel of the sentence-final (or word-final) syllable (cf. the Japanese unit of sound is the syllable) is prolonged and sharply raised. If a speaker asks a question by raising the end of a declarative sentence (e.g. You are a UT student?), the sentence-ending is raised naturally and gradually in the sentence-final word. The quasi-question forms are used as a surface presentation of the speaker's willingness to solicit agreement from his hearer, so the form may result in superficial raising of the final vowel of the final syllable.

11 But at least to me, the quasi-question strategy did not sound modest; it was rather annoying in that I felt as if I was bombarded with tons of requests for agreement which I was actually not asked to answer.
Often, quasi-question forms are used for type (A) propositions, i.e., information which is exclusively known to the speaker and which does not need to be agreed to or confirmed by the hearer.

12 Oishi (1985:33) quoted Ricoeur in order to explain the concept of "appropriation":

If it is true that interpretation concerns essentially the power of the work to disclose a world, then the relation of the reader to the text is essentially his relation to the kind of world which the text presents. The theory of appropriation which will now be sketched follows from the displacement undergone by the whole problematic of interpretation: it will be less an intersubjective relation of mutual understanding than a relation of apprehension applied to the world conveyed by the work. A new theory of subjectivity follows from this relation. To understand is not to project oneself into the text; it is to receive an enlarged self from the apprehension of proposed worlds which are the genuine object of interpretation. Following Gadamer's analysis in Truth and Method, we shall introduce the theme of "play". This theme will serve to characterize the metamorphosis which, in the work of art, is undergone not only by reality but also by the author (writer, artist), and above all (since this is the point of our analysis) by the reader or the subject of appropriation. (Ricoeur, 1981: 185)

His explanation is rather abstract, but in short, it seems Ricoeur meant that through "play", the analyst realizes an "enlarged self" and "the actualization of meaning as addressed to someone" (1981: 185), and in this process, the reader (analyst) forgets himself and the things he previously thought to be natural in language. Then, what should be done practically in appropriating the text?
Oishi himself drew up a three-step framework for his text data: the first step of appropriation followed by a description of the text by the analyst, the second step of appropriation followed by a description of the text by the analyst and the participants, and the third step of appropriation by the analyst with the view integrated through the first and second steps. It was emphasized that the interview of the informants by the analyst provides an important appropriation of the text for approaching the actuality of a conversation.

CHAPTER 5: MODEL OF JAPANESE EVIDENTIALITY

In this chapter, I will propose my model of the framework for Japanese evidentiality based on empirical data as well as on the theories of the universal concept of evidentiality and the Japanese concept of information territory.

THE CONCEPT OF INFORMATION TERRITORY AS BACKGROUND FOR THE MODEL

Direct versus indirect evidentiality

The Japanese evidentiality system model which I propose consists of two basic types of evidentials that are considered universal: "direct evidence" and "indirect evidence" as in Willett's model (cf. chapter two and appendix C). The principal difference between the universal concept of direct evidence and my model is that direct evidence in my model is not limited to that which the speaker has obtained through direct experience; it includes any information to which he has socially authorized primary access, i.e., information (or propositions) which belongs to the speaker's "information territory" (in Kamio's term). Information other than this is considered to be based on indirect evidence and is expressed in structurally indirect forms such as hearsay evidentials and questions. This is the first corollary of the model:

COROLLARY 1 (direct/indirect evidentials): Direct evidentials express a speaker's proposition which falls in the speaker's information territory and to which the speaker has socially licensed primary access in each speech situation.
Indirect evidentials express a proposition which does not fall in the speaker's information territory.

The speaker's and the hearer's information territory

As assumed, the Japanese concept of evidentiality is very deeply related to the knowledge of the speaker and the hearer, just as in the Kogi language (Hansarling, 1984, cited by Palmer, 1986; see chapter two). Furthermore, Japanese evidentiality is specifically related to the concept of information ownership, and is not a simple matter of "knowing" or "not-knowing". Therefore, as the initial task of this research, it was mandatory to come up with the most realistic model of the speaker's psychological information territory. In the process of reaching the final model of evidentiality through data analysis, I found the fundamental concepts in Kamio's model to be very useful. However, from the viewpoint of evidentiality, Kamio's theory does not fully reflect the reality of informants' use of evidentials; consequently, a new framework was necessary. In the model which I am proposing, a speaker's "knowledge" and the "information in his own territory" are treated as distinctly different. In this sense, the condition for being classified as information belonging to the speaker's territory is the most essential corollary in the model.
As explained in chapter three, Kamio provided three conditions for the speaker's territory information1 which I modified based on the results of data analysis as follows:

COROLLARY 2 (the speaker's information territory): A speaker's information territory contains the following four major types of information:
(a) Information obtained through the speaker's past and current direct experience through visual, auditory, or other senses, including the speaker's inner feelings;
(b) Information about people, facts, and things close to the speaker, including information about plans, actions, and behavior of the speaker or other people whom the speaker considers to be close, and information of places with which the speaker has a geographical relation;
(c) Information embodying detailed knowledge which falls within the speaker's area of expertise (professional or otherwise);
(d) Information which is unchallengeable by the hearer due to its historically and socially qualified status as truth.

The above corollary suggests that even if a speaker has some knowledge about his proposition, if the proposition does not meet at least one of these four qualifications, the proposition does not belong to his territory; it is knowledge outside his territory. These days an individual is destined to be exposed to a huge amount of information from various sources. Actually one's daily life is often based on dealing with information, i.e., getting, producing, transferring, evaluating, and manipulating information. Among the assorted information sources, the most reliable one is, naturally, a speaker's direct experience. The information from direct experience is only a small fraction of the entire information which a speaker linguistically expresses in direct forms [i.e., condition (a) in Corollary two]. Target information for direct evidentials involves certain types of information besides direct experience, as (b), (c), and (d) of Corollary two specify.
This kind of information, theoretically and also empirically speaking, motivates a speaker to be linguistically direct. Examples of the information defined as the speaker's information by Corollary two are shown as follows:

(a) Information obtained through the speaker's past and current direct experience through visual, auditory, or other senses, including the speaker's inner feelings;

(5-1)
F26: amerika made dono kurai jikan kakatta ka oboeteru .
USA till how long time took COMP remember(STAT)?
S2: wasureta. neta yo.
forgot slept VOC
F26: Do you remember how long it took to go to America?
S2: I forgot. I slept.

The information, "I forgot" and "I slept", is based on the speaker's direct experience and most genuinely belongs to the speaker's information territory. Both sentences by speaker S2 are direct sentences with direct endings in Japanese. These kinds of propositions are sufficiently straightforward as not to require further examples.

(b) Information about people, facts, and things close to the speaker, including information about plans, actions, and behavior of the speaker or other people whom the speaker considers to be close, and information of places with which the speaker has a geographical relation;

The following two statements are from fathers referring to their sons. Both fathers treat their sons' information as their own, as they consider their sons and matters related to them to be close to themselves:

(5-2)
M12: ano ima borantia undoo o iroiro yatteru
well now volunteer activities OBJ various doing(STAT)
mon desu kara ne
COMP COP(FOR) ABL PART(RAPP)
M12: [my son] is now doing all sorts of volunteer work.

(5-3)
M1: kare, jibun no shumi de atsumeteru hon ga ne,
he himself POSS hobby collecting(STAT) books NOM RAPP
eikoku ni ooi kara ne. militarii bukku.
England LOC many because PART(RAPP) military books
kore wa nee, mukoo iku-to monosugoi ookina
this TOP RAPP overthere go-COND tremendously large
boodaina korekushon ga aru-n-da.
huge collection NOM exist-n-COP
M1: He [= my son], the books he collects as a hobby are abundant in England. Military books. This is, when you go to England, they have a huge collection of this kind of book.

In the following statement, M1 and F18 talked about the current anti-British trend in Australia. Although the speakers are Japanese, they lived in Australia for a long time, and even after returning to Japan, they routinely visit Australia every year. All the participants knew of their close relationship with Australia. Therefore, the speakers are considered to be entitled to speak about the country as their close information. This is an example of a direct evidential of close "geographical relationship".

(5-4)
M1 (1): de ne., ano daiana nanka no ikken ne. .
then RAPP that Diana et al. MODI incident RAPP
(2): eikoku no ooshitsu ni taisuru ishin ga
England POSS crown DAT toward dignity NOM
masu masu sagatte-kita.
more and more decrease(te form)-came.
(3): kanari oosutoraria no hoshutoo ano
very much Australia POSS conservative party that
hoshutekina ootouha ga ne., konogoro osaregimi.
conservative Tories NOM RAPP these days drop-off
F18 (4): eikokukei ga honto sukunaku natta.
British people NOM really became few
M1 (1): Well, that affair of Diana and the spouse.
(2): British royal family is losing prestige [with Australian people] increasingly.
(3): Seriously, Australian conservative party, that conservative royalist faction is recently declining.
F18 (4): People of British origin have become fewer indeed.

(c) Information embodying detailed knowledge which falls within the speaker's area of expertise (professional or otherwise).
In the following speech, M15 is talking about multi-media, especially cyber-space and its future. He is a professor in a related field, so his knowledge can be expressed with direct evidentials although he must have gained the knowledge through indirect channels:

(5-5)
M15 (1): sukunaku tomo ima no intaanetto no yoona
at least current Internet MODI like
bunsantekina joohoo sisutem de iimasu-to
dispersed information system INS say-COND
hijooni ookuno hito ga jibun no hoomu-peegi
very much many people NOM oneself POSS home-page
no yoo na mono o motte, jibun no sakuhin o
MODI like thing OBJ have(te form) oneself POSS creation OBJ
oitari dekiru wake desu ne.
put able COP(FOR) PART(RAPP)
(2): sooshite goku kagirareta hito shika sore o
then very limited people only that OBJ
mi-ni-konai
look-in order to-come(NEG)
(3): sonokawari sono hito ni kannshinn o motta
instead that person DAT interests OBJ had
hito no tame hijyooni fukai mono o yooishite
person POSS benefit very deep context OBJ prepare(te-form)
oku-tte koto ga yariyasuku naru-n desu.
prepare-QUOT COMP OBJ easy to do become-n COP(FOR)
(4): syoosuu no masu-media dake desu-to
a few MODI mass-media only COP(FOR)-COND
soo wa ikanai-n-desu ne..
soo TOP work(NEG)-n-COP(FOR) PART(RAPP)
M15 (1): At least, if it is a dispersed type information system like the current Internet, an extremely large population can have their own home-page, and display their creations in there.
(2): Then, only a limited number of people will come to see it.
(3): Instead, it will be easier for us to prepare an enriched information base only for those who are interested in us.
(4): If we rely on a few limited mass-media systems, it won't be like that.

In the next example of "professional evidence", speakers M14 and F17 spoke to the public in a TV news program. Although the proposition was not obtained through their direct experience, the speakers conveyed their messages as truth, as is required of professional reporters.
In this sense, showing high commitment to the proposition is part of their professional "register". I interpret these as cases of professional knowledge.

(5-6)
M14: shijoo saiaku no kibo de shokuchuudoku no kibo
history worst MODI size INS food-poisoning MODI scale
ga sara ni hirogatte orimasu.
NOM further spreading(te-form) COP(FOR)
M14: Victims of the food-poisoning, which is spreading on a national-record scale, are further increasing in number.

(5-7)
F17: Taifuu ga mottomo sekkin-suru no wa asagata
typhoon NOM most approach COMP TOP dawn
ni naru to iu koto desu. korekara ame ya kaze
DAT become QUOT COMP COP(FOR) from now rain also wind
mo kanari tsuyoku natte kimasu.
also very strong become(te) will become(FOR)
F17: It is announced that the typhoon will be closest to the islands around dawn. From now on, rain and wind will get stronger.

(d) Information which is unchallengeable by the hearer due to its historically and socially qualified status as truth.

This type of direct evidence is similar to one of Givon's (1982) proposition types: "propositions which are to be taken for granted via force of diverse conventions as unchallengeable by the hearer and thus requiring no evidential justification by the speaker" (p.24). A proposition which satisfies condition (d) is not the same as public information, in that public information is known widely but not necessarily known to be true. Type (d) information is known to be true or agreed to be true. A historical fact is an example. Usually this type of information is common-sense knowledge and so is described with a direct ending, often with shared-information evidentials. The next discourse involves a matter related to the Japanese governmental administrative system that satisfies condition (d) of Corollary two.

(5-8)
F5 (1): kondo shoohizei go paasento ni naru-n-datte.
this time consumption tax 5 % to become-n-hearsay
M4(2): soo soo.
it is so
F5(3): soo iu-no katteni kimete ii wake.
such QUOT-COMP freely decide good
aru janai nanka soo-iu no. juumintoohyoo
exist isn't there something so-QUOT COMP referendum
janakute.
NEG(te-form)
M4(4): juumin toohyoo.
referendum
F5(5): katteni kimete ii wake.
freely decide good
M4(6): katte janai yo. tejun o funderu wake.
selfish (NEG) PART(VOC) procedure OBJ take(STAT) (explain)
toohyoo suru dankai de.
voting do step TEMP
F5(1): I heard that the consumption tax will be 5%.
M4(2): It is so.
F5(3): Can they decide it all by themselves? There is something like such and such (isn't there?) It isn't referendum..
M4(4): Referendum.
F5(5): Can they decide it without it?
M4(6): They did not decide it all by themselves. There was a process [to lead to the resolution]. At the time of voting.

In M4(6), the speaker explained to F5 that the government had not ignored the "public will" in deciding to raise the consumption tax rate; its members are elected by the public and are supposed to represent the public. This proposition agrees with the well-known theoretical background of the political representative system of democracy, and is within the scope of common-sense information. Therefore, the argument that the government did not ignore the public in the matter of the consumption tax raise should be handled as logical truth. This logic in the speaker's mind appeared linguistically in his direct sentences in (6) as "unchallengeable truth".

The next example shows a different aspect of condition (d). Speaker F3 experienced the Hanshin Earthquake in 1995, which caused serious destruction in the city of Kobe in Western Japan. Although Japan as a whole has frequent earthquakes and the residents are used to them, Kobe had never had such a serious one and people believed Kobe would never have such an earthquake.
Since "Kobe is an earthquake-free city" was a kind of socially accepted truth (but probably not stratigraphically), speaker F3 treated this information as unchallengeable:

(5-9)
F3 (1): de, kore wa jishin da to wa omotta-n-dakedo
then this TOP earthquake COP COMP TOP thought-n-but
(2): keikenshita koto ga nai shi
experience COMP OBJ NEG also
(3): demo nannka kobe wa jishin ga nai-tte iwareteta kara
but somewhat Kobe TOP earthquake NOM NEG-QUOT-said(STAT) ABL
(4): watashi ga kite kara nankai ka atta-n-dakedo
I NOM came since a few times happened-n-but
(5): sonnani ookii jishin ga kuru to wa
such big earthquake NOM come COMP TOP
yume ni mo omowanai janai..
dream LOC even think(NEG) don't we
F3 (1): Then, I thought this was an earthquake, but
(2): I had no experience, and
(3): But because somewhat Kobe was said to have no earthquake
(4): There were a few earthquakes ever since I came [to Kobe] but..
(5): We did not think such a big earthquake would come even in a dream, did we.

The speaker said line (5), "We did not even dream that we would have such a big earthquake, did we.", as a socially accepted natural assumption shared by people. This is a kind of common-sense thought which should be considered to belong to the direct information territory of everyone (who lives in the area). Since the topic of this case involves geographic information, the case can also illustrate the "geographic closeness" of condition (b) of Corollary two. In the discourse, the speaker used an indirect ending for sentence (3), probably because of the 'distance' which she still felt from the area. She said that she had moved to the area five years prior to the incident and did not yet consider herself to be a real 'local' resident.

In summary, if certain information meets one of the four conditions of Corollary two, the information belongs to the speaker's information territory and he is entitled to use direct evidentials to express the information.
Otherwise, the information belongs to someone else's information territory and so, in my model, even if the speaker has knowledge about the information, the use of indirect evidentials is desirable.

For a speaker, "other people's information territory" includes his hearer's information territory. It seems very important to clarify the conditions for information to be in the hearer's territory. Logically, the conditions of Corollary two should be straightforwardly applicable to characterize information in the hearer's territory. I think it is necessary to assume that a speaker has the same kind of criteria for the hearer's authorized information ownership. This leads to the next corollary:

COROLLARY 3 (the hearer's information territory): A hearer's information territory which is assumed by the speaker contains the following four major types of information:
(a) Information obtained through the hearer's past and current direct experience through visual, auditory, or other senses, including the hearer's inner feelings;
(b) Information about people, facts, and things close to the hearer, including information about plans, actions, and behavior of the hearer or other people whom the hearer considers to be close, and information of places with which the hearer has a geographical relation;
(c) Information embodying detailed knowledge which falls within the hearer's area of expertise (professional or otherwise);
(d) Information which is unchallengeable due to its historically and socially qualified status as truth, and shared by the speaker.

All these hearer conditions are applied to the knowledge status of the hearer as assumed or presupposed by the speaker. Presuppositions and assumptions are based on some kind of evidence; therefore, naturally this corollary for the hearer side is related to evidentiality.
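
The territory conditions of Corollaries one to three can be read as a simple membership test. The following Python sketch is offered only as an illustration of that reading, not as part of the model itself: the names Condition, Proposition, in_speaker_territory, in_hearer_territory, and evidential_type are hypothetical, and the sketch assumes that an analyst has already judged which of conditions (a) to (d) a proposition meets for each party. It does not attempt to cover information shared by the speaker and the hearer.

```python
from dataclasses import dataclass
from enum import Enum


class Condition(Enum):
    """Conditions (a)-(d) of Corollaries two and three (labels are hypothetical)."""
    DIRECT_EXPERIENCE = "a"
    CLOSENESS = "b"
    EXPERTISE = "c"
    UNCHALLENGEABLE_TRUTH = "d"


@dataclass(frozen=True)
class Proposition:
    """A proposition plus the analyst's judgment of which conditions it meets."""
    gloss: str
    speaker_conditions: frozenset = frozenset()  # conditions met for the speaker
    hearer_conditions: frozenset = frozenset()   # conditions the speaker assumes for the hearer


def in_speaker_territory(p: Proposition) -> bool:
    # Corollary two: meeting at least one condition is enough.
    return bool(p.speaker_conditions)


def in_hearer_territory(p: Proposition) -> bool:
    # Corollary three: the same test, applied to the assumed hearer.
    return bool(p.hearer_conditions)


def evidential_type(p: Proposition) -> str:
    # Corollary one: direct if in the speaker's territory, otherwise indirect,
    # even when the speaker has some knowledge of the proposition.
    return "direct" if in_speaker_territory(p) else "indirect"


# "neta yo" in (5-1): the speaker's own direct experience.
slept = Proposition("I slept", frozenset({Condition.DIRECT_EXPERIENCE}))
# An invented proposition that only the hearer experienced directly.
hearer_trip = Proposition(
    "you had a long flight",
    hearer_conditions=frozenset({Condition.DIRECT_EXPERIENCE}),
)
# An invented proposition that meets no condition for either party.
rumor = Proposition("the tax rate will change")

for p in (slept, hearer_trip, rumor):
    print(p.gloss, "->", evidential_type(p))
```

In this reading, knowledge alone never makes a form direct; only membership in the speaker's territory does, which is the distinction between "knowledge" and "information in his own territory" drawn above.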
Information in the hearer's territory but not in the speaker's territory is part of the target of the indirect evidentials, which express information to which the speaker does not have direct socially authorized access. This framework, in which direct experience and indirect experience are contrasted, is, as noted, based on the universal concept of evidentiality and is also relevant to the mental-space model in which direct and indirect memories are contrasted. As was described in chapter three, the mental-space theory (e.g., Takubo and Kinsui, 1990) argues that both the hearer's knowledge (as assumed by the speaker) and other indirect information for the speaker reside in the speaker's indirect memory space, and are accessed and described through indirect linguistic forms. I believe this concept is logical. In my model, indirect evidentials have two target-information sub-types: the information which the speaker assumes to be the hearer's, and information which is neither in the hearer's nor in the speaker's territory. The conditions from Corollaries one, two, and three are summarized figuratively in the following diagram:

[5-10] Direct/indirect evidentials and speaker's/hearer's information territory in the model
Evidentials
(A) direct evidentials: information in the speaker's territory
(B) indirect evidentials: information only in the hearer's territory; information outside of both speaker's and hearer's territory

Information shared by the speaker and the hearer

As the next stage, it is necessary to position information which is "shared" by both speaker and hearer in the model. Data from the informants indicated that there are a few different situations in which certain information is shared. Kamio's model has some problems concerning the issue of shared information. In his early model, in short, Kamio assumed one information category was shared by both speaker's and hearer's information territories (i.e., territory A in [4-21]).
This shared information category was divided into three different levels in his later study (1994) as introduced in chapter three (pp. 80-81) as cases (B), (BC) and (CB) shown below again: 205 [5-11] Three types of shared information between the speaker and the hearer by Kamio (1994) (B) the speaker considered that a given piece of information falls completely into both the speaker's and the hearer's territory of information [i.e., information is completely shared]; or information falls completely into the hearer's territory, and only partially into the speaker's territory. (Case B: nHearer>n) (CB) the speaker assumes that information falls within his own territory to some extent but falls more deeply within the hearer's territory (but the speaker does not necessary assume that it falls into the hearer's territory to the fullest degree). (Case CB: n’