LINGUISTIC CODING OF EVIDENTIALITY IN JAPANESE SPOKEN DISCOURSE
AND JAPANESE POLITENESS


by


Nobuko Trent, B.A., M.A.


Dissertation
Presented to the Faculty of the Graduate School of
the University of Texas at Austin
in Partial Fulfillment of
the Requirements
for the Degree of
Doctor of Philosophy


The University of Texas at Austin
December 1997



Copyright 1997 by Trent, Nobuko 
All rights reserved

TABLE OF CONTENTS 

Chapter 1. Introduction
Chapter 2. Theories of linguistic evidentiality
Chapter 3. Discourse modality in Japanese
Chapter 4. Methodology
Chapter 5. Model of Japanese evidentiality
Chapter 6. Japanese linguistic politeness and evidentiality
Chapter 7. Conclusion
Bibliography


GRAMMATICAL ABBREVIATIONS 
ABL ablative case (kara) 
ACC accusative particle (o) 
AD HON addressee honorifics
AUX auxiliary 
CAUS causative affix (sase) 
CNT contrastive (wa)
CONF sentential particle for confirmation (ne)
COMP sentential complementizer (no, koto, etc.)
COND conditional affix (to, tara, eba, nara) 
CONJ conjecture (daroo, etc.) 
COP copula (da, desu) 
DAT dative particle (ni) 
DES desiderative affix (tai) 
DIR directional case (e) 
EMP emphatic
FOR formal (=AD HON) 
GER gerund affix (te) 
HON honorific form 
HYP hypothetical 
IMP imperative 
INF infinitive (o, i, ku) 
INS instrumental particle (de) 
INJ interjection and hesitation 
IRR irrealis 
LOC locative particle (ni, de, e) 
MODI noun modifier (no) 
NEG negative morpheme 
NML nominalizer (no) 
NOM nominative particle (ga) 
OBJ object marker (o) = ACC 
PART sentential particle: VOC, RAPP, CONF, SHAR 
PASS passive affix 
PERF perfect affix 
POSS possessive 
POT potential affix (re, rare) 
PROG progressive affix 
Q question particle (ka) 
QUOT quotative particle (to) 
RAPP sentential particle of rapport (ne, wa)
REA realis 
REF HON referent honorifics 
RES resultative affix (te-aru) 
STAT stative affix 
TEMP temporal particle (ni, de) 
TOP topic particle (wa) 
VOC vocative sentential particle (yo, zo, ze, sa) 
VOL volitional affix (yoo) 


CHAPTER 1: INTRODUCTION 

When teaching second or foreign language classes, teachers may
often note various phenomena of "language transfer" from a student's
native language to the target language. "Transfer" may be seen in
any aspect of language. For example, if the grammar of a certain
language requires that medicine always be "drunk", a native speaker of
that language is likely to transfer the expression to drink medicine
"lexically" to his second or foreign language. Language transfer
can be phonological, semantic, syntactic, or morphological, and is also
seen at the discourse level, such as in discourse organization and
discourse grammar (cf. T. Odlin, 1989). Presumably, a language
learner also "pragmatically" transfers the "viewpoint" (i.e., the way
reality is viewed) of his native language or native culture to his target
language. Seeing the same reality, people from different cultural or
linguistic backgrounds might perceive it in different
ways or at least encode their perceptions in vastly different ways.1
Thus, even if perceptions do not differ, the rules of
different languages (prescriptive grammar rules and/or pragmatic
rules) certainly must have different emphases in expressing the same
reality.

While teaching Japanese to American students, in addition to
grammatical transfer, I have encountered pragmatic transfer which
may be due to the cultural differences between Japan and America,
to the differences between the pragmatic use of the Japanese language
and that of American English, or most likely to an interplay of both
factors.

In the translated Japanese conversation (1-1) below, for example,
the speaker presented an extremely low-assertive mode of speech in
discussing some religious cult members at large who were suspected of
being responsible for the Sarin Poison Gas case in the Tokyo
metropolitan subway system in 1995, which instantaneously killed or
injured hundreds of people. Rising (↑) and falling (↓) arrows indicate
rising and falling tones in the passage:

(1-1) 

F2: (1) ....that person is, what shall I say, in short, did he make (Sarin 
gas). Well, he made Sarin gas, and should I say he scattered it by 
himself. So, is he a scientist . Aren't most of them specialized in 
that field. So, probably, well, most probably, doing research. 
University research institutes do not have much funding 
generally, so after all, it is said that they entered [the cult group] 
under the condition that they can do whatever scientific 
research they wanted to do. You know, it is said that "religion" 
was a quite different thing for those people. So, it is also said that 
they went into the cult group only because they had desire to 
study more than they could have done at graduate school. So 
should we say they are top class scientists. 

F5: (2) Is that so. 

F2: (3) It is said so. 
(The Original Japanese transcription of this passage is in note 2.) 

In the passage, although speaker F2 was talking about that which
is generally believed to be true, her "level of assertiveness" is very low.
Her utterances sound very unsure in English translation, but in
Japanese this type of low-assertive speech is acceptable, or even
preferred. The speaker used four major techniques to avoid being
assertive: (1) use of structurally indirect sentences such as it is said;3 (2)
use of questions and tag-questions; (3) use of lexical items with low
commitment such as probably; and (4) use of hedges (e.g. you know, well,
and what shall I say). In my pilot study of "hearsay" speech in English
and Japanese (Trent, 1994), Japanese speakers were observed to keep
distance between themselves and the topic of their speech by
consistently using structurally indirect sentences such as I heard.., I
think.., and it seems.. as well as question sentences and tag-question
sentences that appeared to constantly seek the agreement of
the hearers.4 Overall, in comparison with an English speaker's hearsay
report, Japanese speech was seen as less assertive, and tended to sound
more uncertain. Being low-assertive may be accepted as modest and
well behaved in Japanese culture; however, it may not always be
perceived favorably in intercultural communication: the overuse of
less assertive speech may be considered "evasive",
"irresponsible", "ambiguous", or "dubious" by the norms of other
language environments.

People may well consider that the less assertive tendency of
Japanese speech is simply a "cultural" phenomenon. Language and
culture are said to be "interwoven", and there is a view that language
structure possibly influences our thought (e.g. Sapir, 1929; Whorf, 1956).
In this study, I will assume that Japanese indirect and low-assertive
speech is primarily a "linguistic" phenomenon, which can be
systematically explained through a theory of pragmatics.

As a native speaker of Japanese, I intuitively feel the existence of
"rules" which tell us how to be appropriately less assertive and indirect
in interpersonal communication if we want to be a socially competent
person in each speech situation. As Clancy (1986) wrote, "Japanese
rely upon indirection in many common social situations especially
when they are trying to be polite" (p. 215); the factor that motivates
pragmatic rules here is politeness, which eventually leads us to the
cultural aspect of the Japanese language. The rules for less
assertiveness are not a so-called "context-independent grammar", but
rather are rules for "performance" (i.e., "context-dependent
interpretation" by Levinson, 1992).

Hence, this dissertation is a study of Japanese pragmatics, in 
particular, a study of less assertiveness in interpersonal communication 
in the Japanese language. This study investigates the relationship 
between the language and context that is encoded in the structure of 
language, and eventually the rules are examined in relation to
linguistic politeness behavior in the Japanese cultural environment. 

SCOPE OF THE STUDY

There are certainly numerous ways to be indirect in
communication. Theories of pragmatics--speech act and politeness
theories, in particular--provide us with insightful thoughts on this
issue (cf. Lyons, 1983; Searle, 1975). This study specifically attempts to
explore Japanese pragmatic rules which result in less assertive
communication through the "evidentiality" concept, which is encoded
in the language structure.5 What, then, is evidentiality?

Under his "maxim of quality" for conversational principles, i.e.,
"Try to make your contribution one that is true", Grice (1967, first
published 1975) assumed two submaxims: (1) Do not say that which you
believe to be false; and (2) Do not say that for which you lack adequate
evidence (p. 46). Although conformance to these maxims is expected
among rational adult speakers, one does not always have solid evidence
for what one says; therefore, when a given utterance is not supported
by "adequate" evidence, the speaker usually expresses low commitment to
his proposition in various ways. The study of evidentiality is
concerned with how this is done. Evidentiality is generally defined as
"the linguistic means of indicating how the speaker obtained the
information on which he bases an assertion" (Willett, 1988:55).6 Chafe
(1986) viewed evidentiality in a broader way so as to cover "any
linguistic expression of attitude toward knowledge" (p. 271). If an
individual has direct evidence (e.g. witnessing) on which his assertion
is based, he will use direct language forms, while he may speak rather
indirectly when his assertion is based on, for instance, folklore. The
types of evidence that human beings have (e.g. "attested", "reported",
and "inferred") must be universal; however, how differences in
evidence type and in "degree of certainty" are expressed must vary
across languages. Based on these
thoughts, I believe that evidentiality marking can be a useful concept to
apply to Japanese indirect, less assertive communication. If Japanese
speakers' language behavior is more indirect than a universal
standard concept of evidentiality would predict, there must be reasons
behind the Japanese behavior, and this behavior may be systematic
enough to form a pragmatic rule.

Evidentiality markings can be seen everywhere; English, for
example, is said to be abundant in evidentials (cf. Chafe, 1986). There
seem to be two ways to view evidentials. One way is through their
grammatical categories; English evidentials are expressed with modal
auxiliaries (e.g. may, must, might, and can), adverbs (e.g. probably,
certainly, definitely, likely, and possibly), and miscellaneous idiomatic
phrases (e.g. it looks like, it sounds, and it feels like). The other way to
see evidentials is through their function types such as "reliability",
"induction", "deduction", "hearsay", and "sensory". I quote some
examples below of the functions of English evidentials from Chafe
(1986):
[1-2]

-Evidentials which indicate "DEGREES OF RELIABILITY"
(a) We kept thinking maybe they'd be stationed at the Presidio.
-Evidentials which indicate "INDUCTION"
(b) It must have been a kid.
-SENSORY evidentials
(c) I see/hear her coming down the hall.
-Evidentials which express "HEARSAY"
(d) They were using more verbs than English speaking kids have
been said to learn.
-Evidentials which indicate "DEDUCTION"
(e) He or she should take longer to respond following exposure to
inconsistent information than when exposed to no
information at all.
(f) Adults presumably are capable of purely logical thought.
(264-269)
In addition to the examples above, Chafe extended the scope of
evidentials and listed "hedges" and "expectation" as other types of
evidentiality functions. Certainly, this list is neither exhaustive nor
functionally appropriate cross-linguistically.

Although there is not yet a substantial study specifically on
Japanese evidentiality,7 some thoughts on the issue have appeared in
limited ways in the studies of the "modality" of sentences (e.g. Nida and
Masuoka, 1989). The "modality" or "mood" of sentences is another fairly
un-articulated area in linguistics. Lyons's definition of modality as the
"opinion or attitude of the speaker" (1977:452) seems to be widely
accepted. What, however, does "opinion or attitude of the speaker"
actually mean? Fillmore (1968:23) proposed that any sentence has two
main constituents: "proposition" as the basic constituent, and "modality"
(negation, tense, mood, and aspect, etc.). Therefore, logically, all
sentences have some kind of modality, and the evidentiality factor is
part of it. In this dissertation, evidentiality is primarily investigated in
relation to sentential modality. Generally, Japanese sentences mark
modality explicitly at least at the end of the sentence. This is due to the
Japanese SOV sentence structure (i.e., Subject + Object + Verb sequence),
which places the verbal element at the very end of a sentence. Of
course, Japanese has other ways to express a speaker's mood such as
adverbs, deixis, and idiomatic phrases, as English does, but to cover all
evidentiality phenomena would make the scope of this study too broad.
Thus, the main objective of this research is to examine evidentiality in
terms of sentence-final modality.

The purpose of this dissertation, then, is fairly straightforward: to
examine the interpersonal communication of Japanese speakers,
seeking to provide a theoretical construction of the Japanese pragmatic
rules, evidentiality rules in particular, which result in the standard
speaker's preference for less assertive and indirect forms of the
language.

The next chapter briefly overviews existing linguistic theories 
on evidentiality in general as well as work focusing specifically on 
Japanese, particularly in relation to sentence modality.

Chapter three discusses the lack of assertiveness in Japanese
from the perspective of evidentiality. As noted earlier, there has not yet
been significant study of evidentiality in Japanese; the concept of
evidentiality itself has not yet received sufficient attention. One
insightful ideal construct which may have some relation to the issue
was proposed by Kamio (1979, 1985, 1987, 1990, 1994) in his theory of
the information territory of the conversationalists. Kamio argues that a
speaker chooses different sentence-ending modalities to indicate the 
"territory" that he considers the information to belong to: the topic can 
be in the territory of the speaker if it is, for example, about his dinner 
plans; it can be in the territory of the hearer if it is a question about the 
hearer's health; or it can be shared by both speakers' territories if it is 
about a mutual acquaintance. The theory sees the "distance" between 
the topic and the conversationalist from the viewpoint of an 
information territory that each speaker has. 

Although Kamio did not emphasize the question of evidentiality,
his theory is fundamentally related to the concept of
evidentiality in that both concepts deal with how a speaker
linguistically expresses the degree of psychological distance which he 
feels between himself and the topic. 

Chapter three also explores the issue of Japanese low 
assertiveness from the viewpoint of discourse management. 
Considering Kamio's theory and the concept of evidentiality raises the 
possibility that the Japanese concept of evidentiality involves not only 
the distance between the speaker and the topic, but also the distance 
between the hearer and the topic. From this perspective, it follows that
the Japanese evidentiality system is very hearer-sensitive. Takubo
(1990, 1992) and Takubo & Kinsui (1990) argue that a speaker 
continually monitors the hearer's knowledge of the ongoing topic and 
selects appropriate linguistic forms to show this understanding. They 
analyzed Japanese deixis and some sentence-ending forms from this 
perspective. I found Takubo and Kinsui's perspective to be useful for 
the pragmatic conceptualization of evidential markings in that the 
distance between the topic and the participants is the key issue in 
Takubo and Kinsui's theory. They used the metaphorical idea of 
"memory storage" in the human brain: information the speaker stores 
in his direct memory and information which the speaker assumes his 
hearer has that is stored in the speaker's indirect memory are always 
referred to by the speaker to manage the discourse. As I understand it, 
the idea of "direct/indirect memory storage of the participants" of 
Takubo and Kinsui is relevant to the concept of "distance between the 
topic and the participants". 

Chapter four explains the nature of the data on which this study 
is based and discusses the method of analysis. Discourse data of natural 
speech was collected from a variety of speech situations to which 
approximately sixty people from diverse age-groups contributed. Since 
the final goal of this research is to relate the Japanese system of 
evidentiality marking to the Japanese concept of linguistic politeness, 
in the analysis, which is both qualitative and quantitative, the degree of 
formality of speech settings is considered to be the main variable which
decides the speaker's choice of evidentiality markings. Other variables
include the speaker's demographic data, the propositional content of 
the utterance, and the sentence-ending evidential form used for the 
utterance. The relationships between these variables are analyzed from 
the perspective of evidentiality. A custom database was developed and 
used in order to facilitate quantitative analysis. 

Chapter five proposes a model of the Japanese evidentiality
system based on the data and analysis from the preceding chapters. It is
demonstrated that the Japanese system of evidentiality marking can be
systematically explained by the concept of the Japanese speaker's
awareness of the information territories of the participants: a speaker is
aware of the socially acknowledged "owner" of a topic, being
particularly sensitive to his hearer's knowledge, and linguistically
expresses his awareness of the status of the information. In doing so, a
speaker may intentionally overextend his hearers' information territory
so as to include the speaker's own information territory. In this way, the
speaker linguistically pretends that the participants share his information.
This pretense makes his speech less assertive in that the speaker asks
for his hearers' agreement continually during his speech. At the same
time, a speaker may also be cautious and make his information territory
appear smaller than it actually is by exaggerating the distance between
the topic and himself. The speaker may do so by making his speech
structurally indirect. In this sense, how the distance between the topic
and the communication participants is expressed from different
perspectives is the core of Japanese evidentiality marking. The
speaker's emphasis on the distant relationship between himself and the
topic and his emphasis on the closeness between his hearers and his
topic seem to be motivated by the speaker's desire to be polite in
interpersonal communication. In fact, "be indirect" and "show
sharedness of information" are two of a variety of traditional politeness
strategies. Politeness factors and rules such as "higher formality"
(Fraser, 1990), "keep aloof" (e.g. Lakoff, 1973a), and "don't impose"
(e.g. Fraser, 1990; Brown and Levinson, 1978) all suggest indirectness
to some degree. Strategies such
as "show camaraderie" (Lakoff) and "include both speaker and hearer
in the activity" (Brown and Levinson) may be in line with the "show
sharedness" strategy. In Japanese, the use of evidentiality expressions
seems to be a useful linguistic strategy for being polite.

Chapter six then demonstrates how the Japanese evidentiality
system is related to Japanese politeness. It is argued that
observance of the system proposed in chapter five is pragmatically
required in the community in the same way that situationally
appropriate use of honorifics and formal forms is required.

To discuss politeness in the Japanese language inevitably
involves the issue of the relationship between language and culture.
There have been some studies on Japanese politeness in areas such as
honorifics (e.g. Hori, 1986; Hijikata et al., 1986) and women's language
(e.g. Ide and McGloin, 1991; Wetzel, 1988) that delved into the issue of
Japanese culture; however, there is as yet no fully conceptualized
theory of Japanese politeness as a whole.

Brown and Levinson's "face wants" framework, which has
probably been the most influential, views politeness in terms of sets of
strategies on the part of discourse participants for mitigating potentially
threatening speech acts. Their account sees language use as shaped by
the intentions of individuals. In contrast with Brown and Levinson, the
"social norm" view of Japanese researchers (e.g. Hill et al., 1986)
argues that politeness is a set of behavior patterns preprogrammed as a
social norm by those possessing power, such as educators. The social
norm view may be useful for Japanese culture in that this view sees
politeness as having a social function. Bourdieu (1977) claims that
"concessions of politeness are always political concession...practical
mastery of what are called the rules of politeness, and in particular the
art of adjusting each of the available formulae...to the different classes
of possible addressees, presupposing the implicit mastery, hence the
recognition, of a set of opposition constituting the implicit axiomatics of
a determinate political order" (p.95, p.218 cited by Fairclough, 1992).
Referring to Bourdieu, Fairclough (1992) suggests that to investigate
politeness conventions is to gain insight into social power relationships.
I think Bourdieu and Fairclough's view provides the foundation of the
social norm view of politeness. However, the strategic view of
politeness by Brown and Levinson should not be dismissed from the
evidentiality-based viewpoint of Japanese politeness. Conformance to
evidentiality rules is almost always socially preferred, but their use
can also be strategic. This topic is expanded upon in chapter six.

Then what in Japanese culture has formed and maintained the
Japanese politeness concept among the people? This question is explored
in relation to the concept of "territory" in chapter seven.
The insightful concept of "high context" culture versus "low context"
culture, which was originated by Hall (1976) and pursued by his
followers (e.g. Ting-Toomey, 1985; Cohen, 1987), seems to be useful in
understanding Japanese culture as contrasted with Western
cultures. Although researchers have presented a variety of distinctive
differences between the two, in short, high-context cultural behavior is
described as indirect, allusive, group-oriented, and shame-oriented.
Japanese culture is described as being entirely high-context.8 On the
other hand, Western cultures such as American culture are termed
low-context and are characterized as direct, individualistic, and
guilt-oriented. These differences may be seen covertly or overtly in all
aspects of human life including systems of law, trials, politics, and
education. Language behavior, in particular, may present one of the
most crucial distinctions between high- and low-context cultures. So,
the Japanese evidentiality system, which makes utterances less assertive
and thereby culturally acceptable, may be attributed to
high-context Japanese culture, which may be sensitive to the distinction
between outsiders and insiders (i.e., group territory). Concluding this
study, chapter seven discusses this cultural issue behind the Japanese
evidentiality system and linguistic politeness.




CHAPTER 1: NOTES 

In this dissertation, quoted conversational samples are written in 
the format "(x-y)", where x is the chapter number and y is a sequence 
number. For example, (1-2) refers to the second sample in the first 
chapter. Charts, tables, and figures are written in the same fashion 
with the exception that they use square brackets rather than
parentheses. For example, [1-3] refers to the third chart,
table, or figure in the first chapter.

1 Although a discussion of the relationship between "language"
and "thought" is not the topic of this dissertation, it is certainly related
to this study, since how Japanese speakers develop their concept of
evidentiality must depend on a given cognitive environment, namely
Japanese culture.

The language-thought issue is often referred to in discussions of
children's cognitive development; naturally we all underwent the process
of building up our cognitive system while learning to speak our native
language(s). Young children rapidly acquire their native language
while organizing their experiences into concepts. How do they
accomplish these two critical cognitive tasks? Do the linguistic patterns
influence how they view reality? Whorf and Sapir, Piaget, Chomsky, and
Vygotsky are the major theorists whose classic works offer perspectives
on language and thought. A brief explanation of their theories follows.

The Linguistic Relativity Hypothesis holds that the structure
of the language one speaks affects one's perception of the world in a
way that would be different if one happened to speak another language
instead. The boldest presentation of this notion was by B. L. Whorf (e.g.
1956), and it is known as the Sapir-Whorf hypothesis. Whorf saw
thinking as largely a matter of language, inescapably bound up with
systems of linguistic expression: the structure of the language one uses
influences the way in which one understands one's environment.
Therefore, according to his theory, the picture of the universe differs
from one language to another. This notion of linguistic determinism has
been criticized as being too strong, and scholars (e.g. Lenneberg, 1967)
have criticized it for lack of evidence, but a weak version of the
Whorfian hypothesis, which says that the lexical items and linguistic
structures that a language provides can have an important influence on
thought, seems to be more acceptable.

Piaget (e.g. 1968) demonstrated his insight into the
language-thought relationship in his theory of the developmental
sequence of stages
in human cognitive development. In the Piagetian theory of "cognitive
determinism", children learn about the world first, build a cognitive 
structure, then map language information on to the cognitive structure. 
Therefore, in Piaget's theory, language does not cause or affect a child's 
cognitive development. 

Chomsky (e.g. 1975, 1980, 1988) proposed the concept of a 
"language acquisition device", an inborn human mechanism to acquire 
language (syntax, in particular). His assertion is based on three 
assumptions. First, grammars are creative generative rules that enable 
a speaker to produce an infinite number of sentences which he has 
never heard. Second, a child's linguistic environment is too 
"impoverished" to provide a child with a "perfect" model of language 
use, in that adult speakers make errors and use incomplete sentences or 
indirect expressions. So it does not seem that children could deduce the 
structure of language from the finite and imperfect sentences which 
they hear. Third, despite this unfavorable environment, the process of 
language acquisition is fairly uniform across languages. These 
assumptions illustrate the miraculous nature of language development. 
So Chomsky concluded there must be a highly abstract innate structure 
that constrains language acquisition (particularly syntax). Therefore, 
the human biological aspect is more emphasized in Chomsky's theory of 
language development than environmental factors such as culture. 

Vygotsky's "interactionist" approach assumes that higher level
thought processes are derived from social interaction (e.g. Vygotsky,
1962). Vygotsky held that language plays an important role in
human cognitive development: language and cognition
begin as independent processes (by the age of two), but soon
prelinguistic thought interacts with language, and thought is gradually
transformed by it. Once a child establishes the connection between his
experience and language, development in each will influence the other.
This is why Vygotsky was particularly concerned with the field of
education, especially literacy and child development.

2 Original Japanese utterances of discourse (1-1) 

(1)
aa, soo. Ano hito ga ichiban nante iu no, yoosuruni tsukutta .
Well so that person NOM most what shall I say in short made

(2) 
sarin o tsukutte yoosuruni jibun de maita -tte 
Sarin OBJ make(te-form) in short self INS scattered QUOT 
iu ka... 
say wonder 

(3) 
yoosuruni kagakusha . 
in short scientist 
(4) 
hotondo ga daigaku no toki ni sooiu bunya o 
most NOM univ. MODI time TEMP such field OBJ 
senmon to shite yatteta hito-tachi . 
major make(te) did(GER) people 

(5) 
dakara tabun tabun-tte iu ka yoosuruni kenkyuu . 
therefore probably probably-QUOT wonder in short research 
(6)
daigaku no kenkyuujo-tte shikin ga amari nai
univ. POSS research center-QUOT fund NOM much NEG
kara kekkyoku jibun ga ima yatteru-no o
so eventually self NOM now doing-NML OBJ
nandemo sukinayooni tsukur-asete ageru-tte iu
whatever as pleased make-CAUS give-COMP
jyooken de yappari soo-iu-no ga haitta riyuu
condition INS as expected so-COMP-NML NOM entered reasons
ga soo-iu-no mo aru-n-janai-ka to wa
NOM so-called-NML also exist-n-NEG-COMP QUOT CNT
iwa-reteru kedo ne.
say-PASS but RAPP

(7)
dakara kenkyuu shitakute daigaku de wa daigakuin
therefore research want(te-form) univ. LOC CNT grad.school
toka de benkyoositeru ijyoo-ni motto benkyoo shitai-tte iu
such as INS studying more than more study want-QUOT
ishi to iu no toka mo atte itta-n-janai ka to
desire COMP called NML etc also exist(te-form) went-n-NEG QUOT
mo iwa-rete-iru no ne.
also say-PASS STAT VOC RAPP


(8) 
dakara moo toppu reberu no kagakusha-tte iu-ka... 
therefore EMP top level MODI scientist -COMP I wonder 

3 "Indirect speech" in this research is different from "indirect
illocutionary acts" (Searle, 1975). According to Searle, an illocutionary
act can be produced indirectly when the syntactic form of the utterance
does not match the illocutionary force of the utterance. For example, the
syntactic form of the utterance could you keep quiet? is a yes/no
interrogative while its illocutionary force is actually "directive" (i.e., be
quiet). On the other hand, a "direct illocutionary act" is issued when the
syntactic form of the utterance matches the illocutionary force of the
utterance. For example, the utterance you are fired is syntactically
"declarative" and its illocutionary force is "declaration".

Indirect speech in this dissertation simply means structurally
(syntactically and morphologically, in particular) indirect speech
which is often expressed by a complex sentence structure (in the case of
English) in which the matrix verb phrase has some modality of
indirectness. The utterance it looks like he is failing the course is
indirect in terms of assertiveness as well as evidentiality, as opposed to
the direct statement he is failing the course. Question forms are
also indirect in terms of the speaker's degree of assertiveness, and
tag-question sentences are also structurally less assertive.

4 In my paper about hearsay speech (Trent, 1974), I pointed out
two possible causes of the Japanese preference for indirect sentences.
One is the speaker's concept of speech territory; hearsay does not 
belong to the speaker's information territory so the speaker ought to 
express distance between the information and himself through indirect 
sentence forms. The other factor is simply syntactical. Japanese 
sentences have an SOV structure in that a verbal constituent always 
comes at the sentence ending. I assumed that with an SVO sentence 
structure, as in English, a speaker is not necessarily required to repeat 
the same verb phrase of hearsay ("I heard", for example) to tell a 
hearsay story; if he is telling five sentences of hearsay, the first I heard 
phrase may possibly cover the whole discourse. However with an SOV 
sentence structure, if a speaker tries to minimize the use of I heard 
phrase, he needs to put I heard at the very end of the whole discourse. 
This is not acceptable because, in this way, there is no way for the 
hearer to know the speech is about hearsay before the very end of the 
discourse. Therefore, SOV language speakers may tend to repeat the V (I 
heard) at the end of every sentence. I found that many Japanese
speakers preferred making hearsay sentences incomplete and connecting
them by using the te-form of verbs at the end of each sentence. By doing
so, a speaker is able to make the whole discourse sound as if it were an
extremely long single sentence ("te-linkage"), and the speaker simply
puts verb phrases which indicate that the information is hearsay (e.g. I
heard, it seems, I think) at the very end or at the beginning of the
discourse. So, this is, in a sense, a Japanese speaker's strategy to avoid
the inconvenience of an SOV sentence structure when one has to repeat the
same verb phrase. This is my hypothesis, and I have not investigated
the behavior of speakers of other SOV languages.

An example of te-linkage is shown below (English translation of the
discourse immediately follows):

(1-3) 

(1) M3:Maikeru Jakuson ga jyuusansai no otokonoko o
Michael Jackson NOM 13 years-old MODI boy OBJ
tsurekonde
bring in (te) (te-incomplete)

(2)
nani shita-n-kana? nani shitatte, nanka seishikini wa
what did-n-Q what do(te) somewhat officially CONT
happyo saretenai kedo chairudo molesuteision
announce(PASS) (NEG) but child molestation (Noun-ending)
.
.

(3)
sono otokonoko ga beddo de konna koto o
that boy NOM bed LOC like this matter OBJ
sareta toka itte,
did(PASS) such say (te) (te-incomplete)

(4)
uttae o motteitte,
claim OBJ bring (te) (te-incomplete)

(5)
moo sorosoro keijisaiban ni narookana-tte iu chokuzen
yet shortly criminal trial become COMP just before
de wakai ga seiritsu site
TEMP conciliation NOM establish (te) (te-incomplete)

(6)
de, okane, wan milion ka tuu milion ka moratte
then money one million or two million or receive (te) (te-incomplete)

(7)
fairu wa nakatta koto ni sita kedomo
filing SUBJ happened(NEG) COMP made but...

(8)
demo dakara sono ko kara no uttae wa
but because that boy from MODI charge TOP
nakatta kedo ima keisatu gawa ga nannka
happened(NEG) but now the police side NOM somewhat
kenji-gawa toshite sore o saiban ni motte-iku
prosecutors' side as that OBJ trial TEMP bring(te) go
toka dounokouno yattoru to omou
such such and such doing QUOT think. (Indirect)
.
.

(9) Int.:By the way, do you know something about the relationship
between Michael Jackson and Elizabeth Taylor?

(10)M3:iya nanka, naka ga ii kedo....
well somewhat relationship NOM good but....

(11)
nannka Maikeru Jakuson ga sono saibanzata ni
somewhat Michael Jackson NOM that trial matter
nari hajimete tuaa o ichinichi futuka
become start(te) tour OBJ one day two days
yooroppa de yatte, de nokori canserusite
Europe LOC did (te) then the rest cancel (te) (te-incomplete)

(12)
amerika ni kaetta kana tte ittotta kedo jituwa
America LOC returned Q COMP said but as a matter of fact
kaette nakute
return(te) (NEG) happen (te) (te-incomplete)

(13)
Erizabesu Teilaa no uti ni chotto maa otte
Elizabeth Taylor POSS house LOC short while stay (te) (te-incomplete)

(14)
aa, jituwa koko ni ottandesuyo-tte nanka
Oh, as a matter of fact here LOC stayed QUOT somewhat
ni shuukan go gurai ni hyokotto kaettekita
two weeks after about TEMP unexpectedly returned
to iu.....
QUOT said. (indirect)
(English Translation)

(1) A: Michael Jackson brought a 13 year-old boy in, (TE-ending)
(2) What did they do? That is not officially announced
so I don't know well, but child molestation... (noun ending)
.
.
(3) That boy said Michael Jackson did this and that to him in bed, (TE-ending)
(4) [The boy] sued, (TE-ending)
(5) When the case was about to reach the criminal court,
conciliation was made, (TE-ending)
(6) Then, he got the money, one or two million, (TE-ending)
(7) Then, nothing was filed, (TE-ending)
(8) But, even though there was no charge from that boy,
now, the police are trying to bring the case to court
being the prosecution, they are doing that sort
of thing or another, I think, (indirect)
.
.

(9) Int.:By the way, do you know anything about the relationship
between Michael Jackson and Elizabeth Taylor?

(10) A:Well, they are somewhat on good friendly terms. (direct)
(11) When the case [above] was beginning to be serious, he
canceled his European tour after two or three days,
(te-ending)
(12) They were saying that he returned to America but actually
he did not return home, (te-ending)
(13) But stayed at Elizabeth Taylor's house for a while, (te-ending)
(14) Then after about two weeks, he came home saying
he was at Taylor's, it is said like that. (indirect)

The speaker intentionally avoided completing each
sentence in order to connect each one to the final indirectness markers,
I think in (8) and it is said in (14). In a sense, he planned his discourse
ahead to avoid saying I hear or I think at each sentence ending. The
speech sounds fairly informal due to the repeated use of incomplete
sentences. I feel this is good evidence that basic Japanese syntax
influences our hearsay discourse.

5 Some Japanese evidentiality expressions (e.g. expressions of
sensation) appear to be grammaticalized; thus, it is difficult to say if the 
proper use of these expressions is part of sentence grammar or a 
pragmatic requirement. 

Defining "pragmatics", Katz (1977), Kempson (1975) and others 
agreed that grammar and pragmatics are different concepts: 

Grammars are theories about the structures of sentences while 
pragmatic theories do nothing to explicate the structure of 
linguistic construction of grammatical properties and relations... 



They explicate the reasoning of speakers and hearers in working 
out the correlation in a context of a sentence token with a 
proposition (Katz, 1977:19 quoted by Levinson, 1983:8). 

However, on the relationship of pragmatics and grammar, I agree 
with Levinson (1982) in that pragmatics and grammar cannot be 
separated since sometimes aspects of linguistic structure directly encode 
the features of the context ("context-dependent grammar"). 

In Japanese grammar, the use of giving and receiving verbs is an
example of context-dependent grammar. For example, there are five
verbs meaning to give: ageru, kudasaru, kureru, sashiageru, and yaru. 
The correct use of these giving verbs requires an analysis of the 
semantic roles of AGENT, GOAL, and OBJECT based on "semantic scenes" 
(cf. Wetzel, 1984). In short, ageru (and the honorific sashiageru) is used 
when giving to an out-group target, kureru is used when giving to an 
in-group target, and yaru is used when giving to a lower-status target. 
This grammar requires a speaker to analyze the context of a particular 
act of giving. 

6 The definition of "evidentiality" varies among scholars. The 
main reason for this is that evidentiality marking is often interwoven 
with other concepts of grammar such as mood and modality particularly 
in terms of epistemology. Details are discussed in the next chapter. 

7 Aoki (1986) is the only study known to focus specifically on
Japanese evidentiality. It is a short overview of evidential-like aspects
in Japanese grammar. The study lists evidential-like expressions in 
three areas: descriptions of sensation, hearsay markers, and no-
marking which allows a speaker to assert a statement as a fact even if 
direct evidence is not available (cf. chapter two). 

8 In Hall's definition, "context" is "what one pays attention to". He
explained that culture functions as a selective screen on our intake of
information; culture designates what we pay attention to and what we
ignore. In high-context cultures, awareness of this selective process is
high, whereas in low-context cultures people's awareness of it is low.
The process of screening is called "contexting". Hall defined cultures
such as those of the American Indians, in which people are deeply
involved with each other, to be high-context cultures, and defined
individualistic cultures, such as those of the Swiss and the Germans, in
which there is relatively little involvement with people, to be
low-context cultures. (1989: 39-40)




CHAPTER 2: THEORIES OF LINGUISTIC EVIDENTIALITY 

WESTERN THEORIES OF MODALITY AND EVIDENTIALITY 

The study of evidentiality as a linguistic topic has a long history
that starts with the Greek and Platonic tradition and continues to this day
in philosophy. It has become a linguistic issue in dealing with sentential
modalities. The word 'modality' in the English language finds its root in
the Latin modus (manner). Although there are perspectives that do not
acknowledge modality as an independent grammatical category as 
"tense" or "aspect" is acknowledged to be, the fundamental premise of 
this dissertation is that both modality and evidentiality are grammatical 
phenomena, and both categories are treated in that way. As a matter of 
fact, in traditional English grammar, modal auxiliaries such as may, can, 
must, shall and certain verbal endings have been considered a category 
that presents the mood of the sentence. Earlier in this century, the
logician von Wright (1951) proposed four groups of modals: alethic
modes (modes of truth); epistemic modes (modes of knowing); deontic
modes (modes of obligation); and existential modes. He claimed that the
modal concept as a whole is concerned with the concept of "necessity
and possibility". More recently, a new viewpoint regarding modals was
proposed in linguistics. Linguists (e.g. Fillmore, 1968; Lyons,
1977) assumed that a sentence is constructed with two basic components:
a propositional element (the core part of the sentence) and a modal
element (e.g. tense, aspect, and mood). As previously noted, Lyons
defined modality as the "opinion or attitude of the speaker" (1977:452)
toward the proposition as expressed by himself.

Evidentials are defined by Chafe (1986) in the "broad sense" as
marking epistemology, coding the speaker's attitude toward his
knowledge of a situation, and in the "narrow sense" as marking the
source of knowledge (1986:262).1 In proposing two dimensions of
evidentials, Chafe suggested that evidentiality is nearly equivalent to
modality. Certainly, in general opinion, evidentiality as a semantic
domain is considered primarily modal. The notion of modal or modality
is less clearly defined, but it is commonly agreed that evidential
distinctions are a subset of "epistemic modality" marking (e.g. Lyons
1977, Bybee 1985, Palmer 1986). In epistemic modality, the notions of
evidentiality, i.e., necessity and possibility, are viewed with respect to a
speaker's knowledge and belief, upon which he bases his judgement of
the necessity/possibility that the proposition is true. The following
chart [2-1] summarizes the existing views about the position
of sentence evidentiality within the category of sentence modality (e.g.
Lyons, 1977; Palmer, 1986; Bybee, 1985). One linguistic view regards
evidentiality as a synonym of epistemic modality (e.g. Willett, 1988). In
the other view, evidentiality is narrowly defined as the part of
epistemic modality which is concerned with the source of information, as
shown in [2-1] (e.g. Palmer, 1986).




[2-1] Epistemic modality and evidentiality

Modality
(Speaker's opinions and attitude to his proposition)

    Epistemic modality
    (Truth-oriented, concerned with matters of belief,
    knowledge, opinions, etc. A speaker qualifies his
    commitment to the truth of his proposition.)

        Evidentiality
        (Concerned with source of information,
        e.g. hearsay, report, senses.)

        Judgement of necessity and possibility
        (e.g. speaker's speculation, deduction)

    Deontic modality
    (Agent-oriented, concerned with the necessity or possibility of acts
    performed by a morally responsible agent.)
    (ex. John may come. (Permission-deontic possibility)
         John must come. (Obligation-deontic necessity))

Epistemic modality was defined by Palmer (1986) as "showing the
status of the speaker's understanding or knowledge; this clearly
includes both his own judgement and the kind of warrant he has for
what he says" (p.51). Palmer meant that there are two systems of
epistemic modality: one is the speaker's judgement of necessity or 
possibility, and the other is evidentiality. Palmer also indicated how 
these two systems work differently from one language to another. He 
cited English as an example of a language with grammaticalized 
epistemic judgement, and German and others as languages that appear 
to combine the two in a system of grammatical marking. Palmer, 
although having defined evidentiality as being different from epistemic 
judgement, in analyzing various languages, often involved judgement 
type epistemic modalities such as "deductive", "speculative" and 
"assumptive" (his terms) in the scope of evidentiality. Indeed, whether 
we should separate "pure" evidentials (i.e., source of information) from 
epistemic judgement (i.e., statements of necessity and possibility) seems 
to be a persistent problem because a speaker's judgment is based on his 
qualification of evidence. 

Chung and Timberlake (1985) propose a different framework for
mood, which combines epistemic judgement and epistemic evidentiality
(in Palmer's terms) in one category. In doing so, they focus
mainly on the contrast between a realis and an irrealis world:

Mood characterizes the actuality of an event by comparing the 
event world(s) to a reference world, termed the actual world. An 
event can simply be actual (more precisely, the event world is 
identical to the actual world); an event can be hypothetically 
possible (the event world is not identical to the actual world); the 
event may be imposed by the speaker on the addressee; and so on. 
Whereas there is basically one way for an event to be actual, 
there are numerous ways that an event can be less than 
completely actual. For this reason our discussion of mood is 



concerned principally with different types of non-actuality. 

It is also clear, however, that languages differ significantly as to 
which events are evaluated as actual (and expressed 
morphologically by the realis mode) vs. non-actual (and 
expressed morphologically by their irrealis mood). 

(1985:241) 

The ways to show realis/irrealis are certainly
diverse among languages; some languages may have grammaticalized
rules to mark realis and irrealis, some languages may have only
pragmatic rules, and some may be dependent on each speaker's
subjective decision. Chung and Timberlake posit three types of mode:
"epistemic mode", "epistemological mode", and "deontic mode". The
difference between their "epistemic mode" and "epistemological mode"
is firmly within the scope of this dissertation. They characterize each
mode as follows:

The epistemic mode characterizes the event with respect to the 
actual world and its possible alternatives. If the event belongs to 
the actual world, it is actual; if it belongs to some possible 
alternative world (although not necessarily to the actual world) it 
is possible; and so on. 

Two subtypes of epistemic mode are often distinguished: 
necessity (the event belongs to all alternative worlds) and 
possibility (the event belongs to at least one alternative world). 
These subtypes are illustrated by one sense of the English modal 
auxiliaries; consider John must be in Phoenix by now ( = in all 
alternative worlds that one could imagine at this time, John is in 
Phoenix) and John can/may be in Phoenix now ( = there is at 
least one world one could imagine in which John is in Phoenix). 

(1985:242) 

Given that the epistemic mode characterizes the actuality
of an event per se, it does not include a participant target or
strictly speaking, a source.

The epistemic mode can be contrasted with a related mode,
the epistemological mode, which differs only in that it more
clearly involves a source. The epistemological mode evaluates the
actuality of an event with respect to a source. The event may be
asserted to be actual, or else its actuality may be dependent on the
source in one of several ways.

(1985:244) 
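
The necessity/possibility contrast in the first passage quoted above can be restated as quantification over alternative worlds. The following Python sketch is only an illustration and is not part of Chung and Timberlake's framework: the representation of a world as a set of event labels, the function names `is_necessary` and `is_possible`, and the sample worlds are all assumptions made for this example.

```python
# Illustrative sketch only. By assumption, each "world" is modeled as a
# set of event labels that hold in that world.


def is_necessary(event, worlds):
    """Necessity: the event belongs to all alternative worlds."""
    return all(event in world for world in worlds)


def is_possible(event, worlds):
    """Possibility: the event belongs to at least one alternative world."""
    return any(event in world for world in worlds)


# Hypothetical alternative worlds that one could imagine at this time.
worlds = [
    {"John is in Phoenix", "it is raining"},
    {"John is in Phoenix"},
    {"John is in Tucson"},
]

# "John can/may be in Phoenix now": at least one world contains the event.
print(is_possible("John is in Phoenix", worlds))
# "John must be in Phoenix by now": every world would have to contain it.
print(is_necessary("John is in Phoenix", worlds))
```

In this toy model the epistemic mode needs no source argument at all, which mirrors the point of the second quoted passage: a source enters only with the epistemological mode.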

Chung and Timberlake claimed to have discovered, in their
survey of the essentials of tense, aspect and mood in Lakhota, Takelma,
German, and others, that a speaker uses the epistemic mode and the
epistemological mode differently. As quoted above, they define
epistemic mode as the mode that characterizes the situation the speaker
is describing with respect to both the actual world and another possible,
non-actual world (i.e. the world of necessity vs. the world of possibility)
and epistemological mode as the mode that is used to evaluate the
actuality of the situation with respect to the speaker's source of
information. Therefore, Chung and Timberlake's "epistemological
mode" theoretically involves both "evidentiality" and "judgement of
necessity and possibility", which are separated in the traditional view
[2-1]. Within this "epistemological mode" they proposed four parameters:

[2-2] Parameters of epistemological modes proposed by Chung and 
Timberlake (1985:244) 

(a) "EXPERIENTIAL", in which the event is characterized as witnessed or
otherwise experienced by the "source" (i.e. the speaker);2
(b) "INFERENTIAL", or "EVIDENTIAL", in which the event is
characterized as inferred by the speaker from evidence;
(c) "QUOTATIVE", in which the event is reported from another source,
told to the speaker by someone else; and
(d) "CONSTRUCT", the submode in which the event is a speaker's
construct (thought, belief, fantasy) of the source.
Parameter (a) is direct evidence, parameter (b) is "judgement of 
necessity and possibility" in the traditional sense, and parameter (c) is 
so-called evidentiality in the traditional narrow sense. Parameter (d) is,
perhaps, the speaker's "judgement", though more subjective than (b).
These four parameters are similar to those of Chafe (1986) presented in 
chapter one. 

At a glance, the distinction between epistemic mode and
epistemological mode in Chung and Timberlake's terms is not as
straightforward as they claim it to be. On this point, Chung and
Timberlake state that some languages may "use the same morphology to 
encode the epistemic and epistemological modes, suggesting that these 
modes are concerned with similar types of non-actuality" although "a 
language may express epistemologically uncertain events with 
morphology used basically for epistemic non-actuality" (p.245); 
therefore, the distinction may not be applicable for some languages. 
Examples of both "epistemic mode" and "epistemological mode" from 
Chung and Timberlake's framework may help one understand their 
distinction between the two modes. The following example from 
Takelma is in distinct realis mode with a verb of distinct realis:

 (2-3)Mena yap�fa t�fomo-k�fwa 
bear man kill(REALIS)-3HUMAN OBJ 
(The bear killed the man) 



Chung and Timberlake claim that the next Takelma sentence is in the 
inferential epistemological mode, with a different stem used for all non-
actual moods and a special inferential suffix -kt:

 (2-4) Mena yap�fa tdomo-k�fwa -kt 
bear man kill(IRR)-3HUMAN OBJ-INFERENTIAL 
(It seems that the bear killed the man/The bear must have, 
evidently has, killed the man.---Epistemological) 
The irrealis mode of the above sentence (2-4) (i.e., highly possible 
world) is grammatically contrastive with the "actual world" but, at the 
same time, the mode of inference (based on some evidence obviously) is 
grammatically presented. The next example is from Lakhota (Boas and
Deloria, 1941, also quoted by Chung and Timberlake), where the verb suffix
tkha is analyzed as marking a "counterfactual but hypothetically
possible event"; therefore, the case represents the epistemic mode of
Chung and Timberlake:
(2-5)Lehayela me-t?a tkha 
now I(SG)-die HYP 
(I could have/almost died. -Epistemic) 
In Lakhota, a simple sentence without any evidential basis to
support it can only use the realis mode. Willett (1988) argues that all
four parameters (a) to (d) in [2-2] proposed by Chung and Timberlake
are "evidential-like" and maintains that he found the same parameters
in the languages he examined. Willett concludes that inference is "best
treated as a third major type of evidential, on a par with sensory and
reported evidence", and that these three "form a set of epistemic
distinctions that contrast semantically with those of confidence (i.e.,
judgement)" (1988:54). Thus he defines evidentiality as "the linguistic
means of indicating how the speaker obtained the information on
which he bases an assertion (and reliability of a speaker's knowledge)",
which I have adopted as a general definition in this dissertation.
Therefore, the scope of evidentiality in this study is approximately the
same as the phenomena termed "epistemic modality" in the traditional
sense, "epistemological mode" as characterized by Chung and Timberlake,
and "evidentiality" by Willett. The meanings of different types of
evidentials are summarized by Willett (1988) as follows in [2-6].

Please note that [2-6] still deals with two major types
of information source: direct (experiential) and indirect
(inexperiential) evidence. Inexperiential evidence, however, involves
more than hearsay, which differs from the popular "lay" understanding
that evidentials equal hearsay. Therefore, logically, what an
"evidentiality-conscious" speaker does is to incorporate information about
his information source (direct, reported-indirect, or inference-indirect)
into the modality of his proposition.




[2-6] Meanings of grammatical evidentials by Willett (1988:96) 

I. Direct evidence: the speaker claims to have perceived the situation
described, but may not specify that it is sensory evidence of any kind.
   A. Visual evidence: the speaker claims to have seen the situations
   described.
   B. Auditory evidence: the speaker claims to have heard the situations
   described.
   C. Sensory evidence: the speaker claims to have physically sensed
   the situation described. This can be viewed as (a) in opposition to
   one or both of the above senses (i.e. any other sense), or (b)
   unspecified as to sensory mode (i.e. any sense).
II. Indirect evidence: the speaker claims not to have perceived the
situation described, but may not specify whether the evidence he does
have is reported to him or is the basis of an inference he has made.
   A. Reported evidence: the speaker claims to know of the situation
   described via verbal means, but may not specify whether it is
   hearsay (i.e. second-hand or third-hand), or is conveyed through
   folklore.
      1. Second-hand evidence: the speaker claims to have heard of the
      situation described from someone who was a direct witness.
      2. Third-hand evidence: the speaker claims to have heard about
      the situation described, but not from a direct witness.
      3. Evidence from folklore: the speaker claims that the situation
      described is part of established oral history.
   B. Inferring evidence: the speaker claims to know of the situation
   described only through inference, but may not specify whether such
   inference is based on observable results or solely on mental
   reasoning.
      1. Inference from the results: the speaker infers the situation
      described from his observable evidence.
      2. Inference from reasoning: the speaker infers the situation
      described on the basis of intuition, logic, a dream, previous
      experience, or some other mental construct.
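
The hierarchy in [2-6] can be pictured as a small tree. The Python sketch below is an illustrative encoding only: the short category labels, the `PARENT` table, and the helper names `lineage` and `is_direct` are assumptions introduced here and are not Willett's notation.

```python
# Illustrative encoding of the outline in [2-6]. Each category points to
# the category above it; None marks the two top-level types (I and II).
# The short labels are hypothetical names chosen for this sketch.
PARENT = {
    "direct": None,
    "visual": "direct",
    "auditory": "direct",
    "sensory": "direct",
    "indirect": None,
    "reported": "indirect",
    "second-hand": "reported",
    "third-hand": "reported",
    "folklore": "reported",
    "inferring": "indirect",
    "results": "inferring",
    "reasoning": "inferring",
}


def lineage(category):
    """Return the path from the top-level type down to `category`."""
    path = []
    node = category
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return list(reversed(path))


def is_direct(category):
    """True if the category falls under direct (experiential) evidence."""
    return lineage(category)[0] == "direct"


print(lineage("third-hand"))
print(is_direct("auditory"), is_direct("reasoning"))
```

The encoding makes one property of [2-6] explicit: hearsay is only one branch of indirect evidence, alongside inference.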




However, as I wrote earlier, the simple difference between direct 
and indirect is not enough to explain Japanese evidentials which seem 
to involve not only the speaker's knowledge but also the hearer's 
knowledge. In a later section I will attempt to incorporate the view 
from Revisionist Epistemology (Givon, 1982) that emphasizes the 
influence created by the hearer's knowledge of the speaker's 
proposition. Also, some indirect evidence (both reported and inferred) 
in [2-6] can be treated as direct evidence (in a sense) by a speaker in 
discourse depending on how intimate the speaker feels about the 
proposition. This Japanese concept of direct evidence will be explained 
by the concept of the speaker's psychological information territory. 
These factors regarding the Japanese concept of evidence require a
unique evidentiality system framework that is not fully explainable by
what is assumed to be the universal standard concept summarized in [2-6].

EXAMPLES OF GRAMMATICIZED EVIDENTIALS 

Before getting into the evidentials in the Japanese language, it
would be useful to look at some examples of systems of evidentiality from
various languages. Most languages do not have a grammaticized system
of evidentials, although some do; often evidentiality expressions
reflect a speaker's subjective judgement so that a speaker is,
theoretically speaking, free to choose his own system. English, as well
as Japanese, belongs to this "free" group.




The following example is from the Tuyuca language (Brazil and
Colombia) investigated by Barnes (1984). The sentences in [2-7]
can all be translated into English as "he played soccer." Tuyuca cases
have been quoted in many studies since the language shows a clear
example of a grammaticalized evidential system.

[2-7] Tuyuca evidentials 

(a) diiga ape-wi            (I saw him play. --- visual)
    he    play-evidential
(b) diiga ape-ti            (I heard the game and him, but I didn't
    he    play-evidential   see it or him. --- senses other than visual)
(c) diiga ape-ye            (I have seen evidence that he played:
    he    play-evidential   his distinctive shoe print on the
                            playing fields. But I did not see him
                            play. --- apparent)
(d) diiga ape-yigi          (I obtained the information from
    he    play-evidential   someone else. --- hearsay)
(e) diiga ape-hiyi          (It is reasonable to assume that he did.
    he    play-evidential   --- assumed)
(Palmer, 1986:67)

Palmer commented that the Tuyuca system is a case of
grammaticalized "pure" evidentials (1986:67). It is reported that in
Tuyuca, morphological forms of the verbal tense/person suffix function
to indicate the source of information on which the speaker's proposition
is based. Two types of direct evidence, (a) and (b), and three types of
indirect evidence, (c), (d), and (e), in [2-7] are encoded in the grammar.
Since this is a part of the grammar of the language, speakers of Tuyuca
are required by the grammatical system to articulate the source of
information.
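
Because each suffix in [2-7] is tied to exactly one information source, the paradigm can be read as a lookup table. The Python sketch below is only an illustration built from the five forms cited above: the table name, the function `describe`, and the short category labels are assumptions made here, and the hyphenated spelling of the forms simply follows the transcription used in [2-7].

```python
# Illustrative lookup built from the five forms cited in [2-7].
# Values are (evidence category, direct/indirect), following the
# classification given in the text: (a)-(b) direct, (c)-(e) indirect.
TUYUCA_EVIDENTIALS = {
    "wi": ("visual", "direct"),
    "ti": ("nonvisual", "direct"),
    "ye": ("apparent", "indirect"),
    "yigi": ("hearsay", "indirect"),
    "hiyi": ("assumed", "indirect"),
}


def describe(verb_form):
    """Split a form such as 'ape-wi' and look up its evidential suffix."""
    stem, sep, suffix = verb_form.rpartition("-")
    if not sep or suffix not in TUYUCA_EVIDENTIALS:
        raise ValueError(f"no evidential suffix from [2-7] in {verb_form!r}")
    category, directness = TUYUCA_EVIDENTIALS[suffix]
    return stem, category, directness


print(describe("ape-wi"))
print(describe("ape-yigi"))
```

A form without one of the listed suffixes raises an error in this sketch, which is a rough analogue of the obligatoriness described above: the speaker cannot leave the source unmarked.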

The next example of grammaticized evidentials is from Kogi
(Chibchan, N. Colombia) studied by Hansarling (1982), and discussed also
by Palmer (1986). The grammar of this language requires its speaker to
be conscious of the hearer's knowledge. If a speaker judges his 
proposition to be known to both parties, he has to use the particle ni 
("reminding"); if he assumes that his proposition is not known to the 
hearer, the na particle is used to indicate that the speaker is 
"informing". In case the speaker does not have a certain piece of 
information and assumes that his hearer has that information, he uses 
the shi particle ("asking"). If the speaker does not have a certain piece 
of information and he assumes that his hearer does not know either, the 
speaker uses the modality of the skan particle (expression of "doubt"). 
And if he is not sure if his hearer has information that he does not 
know, he is required to use the modality of the ne particle 
("speculation"). A summary chart of the Kogi system is shown in [2-8]
with sentential examples. In the following figure, "+" indicates that the
information is known, while "-" means that information is not known.




[2-8] Kogi evidential system (Palmer, 1986: 76)

Evidential particles   Speaker   Hearer   Function of evd.
(a) ni                 +         +        remind
(b) na                 +         -        inform
(c) shi                -         +        ask
(d) skan               -         -        doubt
(e) ne                 -         ?        speculate

Sample sentences using (a) - (e) evidentials:
(a') ni-gu- ku- a. (I did it just a while ago, as you know - remind)
(b') na- gu-gu. (I tell you he did it some time ago - inform)
(c') shi- na (Is that the way it is? - ask)
(d') shag-gu (Who knows if it did just now? - doubt)
(e') nabbi no guste ne ha gna (I wonder if it is a small lion, he
thought - speculate)

The evidential system in Kogi indicates "who knows what about
the situation being discussed" (Hansarling, 1982:52 quoted by Palmer,
1986:76). Interestingly, this system is related to the psychological
concept of a speaker's information territory, which is introduced in this
dissertation to conceptualize the Japanese system of
evidentiality (see later section of this chapter). The most significant
difference between Kogi and Japanese is that Kogi evidentials are
grammaticized whereas those of Japanese are not.
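
The choice of particle summarized in [2-8] depends on two values, what the speaker knows and what the speaker assumes the hearer knows, so it can be written as a decision table. The following Python sketch is an illustration under stated assumptions: the names `KOGI_PARTICLES` and `kogi_particle` are hypothetical, and `None` is used here to stand for the "?" in the chart (the speaker is unsure whether the hearer knows).

```python
# Decision table transcribed from chart [2-8].
# Keys are (speaker_knows, hearer_knows); None stands for "?" in the chart.
KOGI_PARTICLES = {
    (True, True): ("ni", "remind"),
    (True, False): ("na", "inform"),
    (False, True): ("shi", "ask"),
    (False, False): ("skan", "doubt"),
    (False, None): ("ne", "speculate"),
}


def kogi_particle(speaker_knows, hearer_knows):
    """Return (particle, function) for a combination listed in [2-8]."""
    try:
        return KOGI_PARTICLES[(speaker_knows, hearer_knows)]
    except KeyError:
        # The chart gives no particle for this combination.
        raise ValueError("combination not listed in chart [2-8]") from None


print(kogi_particle(True, False))
print(kogi_particle(False, None))
```

The table has five rows rather than six: the chart lists no particle for a speaker who knows the proposition but is unsure about the hearer, and the sketch reports that case as an error instead of guessing.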

Palmer (1986) also suggested that Nambiquara (Brazil) is another
example of a language in which epistemic modality is grammaticalized
in that various combinations of the speaker's and the hearer's
knowledge are used as indicators of different epistemic modality. Lowe
(1972) analyzed the rather complex evidentiality of Nambiquara as a
two-dimensional system that crosses speaker orientation with an
"individual mode" and a "collective mode" of event verification:

speaker orientation : observation, deduction, or narration 

event verification : individual or collective verification 

According to Palmer, the speaker-orientation system of 
Nambiquara is equivalent to an evidential system: "observation" means 
sensory acquisition of information, "deduction" means existence of 
enough evidence for the proposition, and "narration" is hearsay speech. 
Event verification should be applied to each type of "speaker 
orientation". Therefore, there are six categories in Nambiquara's 
evidentiality system, as summarized in [2-9] below: 

[2-9] Nambiquara evidentiality system 

(a) Individual observation: "I report to you what I saw the actor 
    (=subject) doing." 
    (e.g. He worked.) 
(b) Individual deduction: "I tell you my deduction of an action that 
    must have occurred because of something I see or saw." 
    (e.g. He must have worked.) 
(c) Individual narration: "I was told by someone that a certain 
    action occurred." 
    (e.g. I was told that he worked.) 
(d) Collective observation: "I report what both I and the addressee 
    saw the actor doing." 
    (e.g. Both you and I saw that he worked.) 
(e) Collective deduction: "From what the speaker and the addressee 
    saw, they deduce that a certain action must have taken place." 
    (e.g. He worked, as deduced from what we saw.) 
(f) Collective narration: "Both speaker and addressee were told 
    that a certain event took place." 
    (e.g. It was told us that he worked.) 

The system of Nambiquara is different from that of Kogi in that it 
does not involve information which is known only to the addressee (the 
hearer). In Nambiquara, the speaker is required to pay attention to 
whether information is known only to the speaker or known to both 
parties. 
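
The six categories in [2-9] are the cross product of the two dimensions, and this can be made explicit in a short Python sketch. The names ORIENTATIONS, VERIFICATIONS, nambiquara_categories, and knowers are hypothetical labels chosen here for illustration; the sketch only restates the summary above and makes no claim about Nambiquara morphology.

```python
from itertools import product

# The two dimensions described by Lowe (1972) as summarized above.
ORIENTATIONS = ("observation", "deduction", "narration")
VERIFICATIONS = ("individual", "collective")


def nambiquara_categories():
    """List the six categories of [2-9] in the order (a) through (f)."""
    return [
        "%s %s" % (verification.capitalize(), orientation)
        for verification, orientation in product(VERIFICATIONS, ORIENTATIONS)
    ]


def knowers(verification):
    """Who holds the information: the speaker alone, or both parties."""
    if verification == "individual":
        return ("speaker",)
    if verification == "collective":
        return ("speaker", "addressee")
    raise ValueError("unknown verification mode: %r" % (verification,))


if __name__ == "__main__":
    for label in nambiquara_categories():
        print(label)
```

Note that knowers never returns the addressee alone, which is the difference from Kogi pointed out above.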

The cases of Tuyuka, Kogi and Nambiquara support the 
Revisionist Epistemology theory of Givon (1982) in that the existence of 
the hearer is an influential factor in evidentials in these languages. As 
noted earlier, in the traditional view of epistemology, the essence of 
the sentential mode was a matter of truth or falsity. Therefore, 
traditionally, neither the speaker's subjective certainty nor the 
existence of hearers was considered to be important in theories of 
evidentiality. However, truth is rarely absolute. As Chafe claimed, "the 
study of evidentiality is about the human awareness that truth is 
relative, and particularly about the ways in which such awareness is 
expressed in languages" (1986: vii). In modern times, attempts have 
been made to show that at the bottom of propositional/sentential 
modalities lies an implicit contract between the speaker and the hearer. 

From this perspective, Givon (1982) proposed to categorize propositions 
into three types: 

[2-10] 

(a) propositions which are to be taken for granted, via the force 
of diverse conventions, as unchallengeable by the hearer and 
thus requiring no evidentiary justification by the speaker; 
(b) propositions that are asserted with relative confidence and 
open to challenge from the hearer and thus require--or admit--
evidentiary justification; and 
(c) propositions that are asserted with doubt as hypotheses and 
thus beneath both challenge and evidentiary substantiation. 
They are, in terms of the implicit communicative contract, "not 
worth the trouble". 
(1982:24, italics in the original) 

As suggested above, for Givon, the knowledge level (the degree of 
necessity of the proposition) of the speaker and the hearer matters in 
deciding the necessity of evidentials. Givon rejected the concept of 
linguistic sentential modality which had been under the influence of 
the classic Platonic tradition, i.e., the traditional view of epistemology in 
which the essence of mode is whether the proposition is true or false by 
virtue of various modes of access to truth or knowledge. Givon stated: 

This [Platonic] tradition has derived the bulk of its support from 
linguistic analysis of a distinct kind: Propositions are considered 
in isolation from each other as to their truth and epistemic status. 
Sentential modalities thus appear to be an objective matter, to 
which neither the speaker nor the hearer--the two participants 
in the communicative transaction in which human language is 
actually used--bear any relevance. The recent renaissance in 
the study of communicative pragmatics has so far made nary a 
dent in this tradition. The speaker's subjective certainty is not 
considered seriously in traditional epistemology, but rather 
relegated to the realm of psychology. The hearer's role in the 
communicative transaction is not even contemplated. (p.24, 
italics in the original) 

Consequently, the speaker's subjective certainty is an inferential 
by-product of the evidentiary, experiential aspect of knowledge, 
while the logician's "truth" is again an inferential by-product of 
both evidentiary source and subjective certainty. (p.25, italics in 
the original) 

Lyons (1977) also made a similar distinction between subjective and 
objective types of epistemic mode (as well as deontic mode): in his 
theory, the objective epistemic mode is a matter of degree of necessity, 
and subjective meaning is evidential by nature, but he did not elaborate 
on this concept. 

I think that Givon made two particularly noteworthy points: first, 
when we treat the necessity and possibility of a proposition as 
objectively measurable, we are in fact dealing with the speaker's 
subjective certainty; second, the speaker certainly pays attention to 
the hearer in choosing evidentials, since the chosen evidentials 
indicate the speaker's subjective certainty, which might be offensive 
to the hearer in some way. 

I believe that these theories and analyses of sentential modality 
are also useful in analyzing discourse modality; they provide us with 
a good understanding of the modal meanings of isolated words and phrases 
which can be utilized in the larger scope of discourse modality. Givon's 
view is in line with the theories of "discourse modality" (e.g. Maynard, 
1993) in arguing that a theory of sentence modality does not always 
reflect actual language use. This point will be elaborated on in later 
sections. 

STUDIES ON JAPANESE MODALITY 

Interestingly enough, earlier this century, some Japanese 
linguists proposed that a sentence has propositional and modal contents 
(e.g. Tokieda, 1950, Hashimoto, 1948). The idea was similar to Fillmore's 
later proposal (1968), although naturally the linguistic form of 
Japanese modality is different from that of English. English modals are 
easy to understand due to their close relationship with auxiliary verbs 
(e.g. do, have, shall, be, will, may, ought) which are morphologically 
independent. English auxiliary verbs are defined as expressing 
tense, person, number, and mood while accompanying and helping 
another verb. In Japanese, jo-dooshi (助動詞) are closest to English 
auxiliaries in their function.3 However, since the Japanese language is 
"agglutinative" by nature (cf. English is "inflectional"), the Japanese 
jo-dooshi are not morphologically independent, but usually attached to the 
main verbs or adjectives in such a way that they look like a part of the main 
lexical item's conjugation. Hashimoto (1948) viewed jo-dooshi as 
independent lexical items. He suggested two types of jo-dooshi: those 
attached to nouns and adjectives, and those attached to verbs.4 
Hashimoto proposed the concept of bunsetsu (phrase) in that, as he 
argued, jo-dooshi--together with the main lexical item which it is 
attached to--constitute a bunsetsu (phrase), and one or more bunsetsu 
constitute a bun (sentence). Tokieda (1941, 1950), in conjunction with 
his grammatical theory "gengo katei setsu" (theory of language as a 
mental process), proposed to divide sentences into two parts: shi (詞) 
(objective/subjective notions such as book, sad, etc.) and ji (辞) (concepts 
outside of objectifiable expressions). For Tokieda, shi is a result of 
"abstraction" (e.g. the word "book" is not the same as the object "book" 
but a linguistic abstraction of the object "book"), while ji directly 
represents a speaker's position which is in the abstraction process. In 
the following sentence, for example, yuki ga furu (snow falls) is shi, 
and kamoshirenai (might, perhaps) is ji: 

                 Bun (sentence) 

[2-11]         shi              ji 

         Yuki  ga   furu   -kamoshire-nai 
         snow  NOM  fall   AUX(might) 
         (It might be snowing.) 

Kamoshire-nai expresses the speaker's view of shi (an objective 
event): yuki ga furu (snow falls). Thus, Tokieda claimed that jo-dooshi 
is an independent part-of-speech category, and shi and ji have different 
functions in that shi are "enveloped" in ji (1955:278). Tokieda also 
proposed to include verbal, adverbial, and nominal suffixes in the shi 
constituent as setsubi-go (suffixes), as shown in the following example: 

[2-12]                    Sentence 

                 shi                          ji 

     Taroo wa   sushi o    tabe  -nakat     -ta         -rashii. 
     Taro  TOP  sushi ACC  eat   NEG        PAST        seem 
                                 jodooshi   setsubigo   jodooshi 

     (It seems that Taro did not eat sushi.) 

In [2-12], the negative auxiliary, -nakat-, is a part of shi (i.e., 
proposition). Thus, in Tokieda's view, jo-dooshi (i.e., Japanese AUX) is 
not always in the ji-phrase (i.e., modal) while in English, auxiliary 
verbs are usually modal. Tokieda's theory influenced subsequent 
research on Japanese syntax. There are differences among the 
researchers' concept of models; however, all seem to agree that a 
sentence has a modal constituent to "envelop" propositional context: 
Tokieda's ji, Yamada's chinjutsu (1951), Mikami's muudo (1963), 
Teramura's muudo (1979), and Nakau's (1976) and Nitta's (1989) modality 
all describe the same linguistic phenomena. Thus, it seems that the 
dichotomy of propositional content and modal content has been adopted 
in Japanese. Tokieda claimed that modal content syntactically involves 
all the constituents from a tense-marker to the end of the sentence and 
functions to express a speaker's subjectivity toward his proposition. This 
point also seems to have been adopted by other Japanese linguists, but 
exactly what should be included in modality is a topic of ongoing 
discussion. 

Nakau (1976) has defined modality as a speaker's psychological 
attitude at the time of speech. Nakau included tense, aspect, negation, 
question, and complementation in the domain of propositional content, 
and therefore held that modal content exists outside of this domain. 
Masuoka (1989) claimed that modality can exist in every constituent of a 
sentence, meaning that modality is also in propositional content. He 
wrote that apart from the speaker-subjective "primary modality", a 
sentence has a "secondary modality" which can be objective. According 
to Masuoka, secondary modality includes politeness, transmission of 
thoughts, judgement, explanation, topicalization, and other functions in 
addition to traditional modal functions (e.g. tense, aspect, and negation). 
It seems that the scope of Japanese modality is still unclear, at least 
partly because the acknowledged definition of modality, "speaker's 
psychological attitude", can be interpreted in various ways involving 
numerous linguistic and psychological phenomena of language, but at 
least sentence-final ji is generally acknowledged as modality. 

STUDIES ON JAPANESE EVIDENTIALITY 

There are only a few studies which have focused on Japanese 
evidentiality per se (e.g. Aoki, 1986; Watanabe, 1984), although some 
studies of modality superficially refer to evidentiality (e.g. Nitta and 
Masuoka, 1989; Nakau, 1976). In the traditional view of sentential 
modality with a focus on auxiliaries, there are seven major modal 
auxiliaries that express epistemic modality, or epistemology (i.e., 
evidentiality, in this study), which qualifies a speaker's commitment to 
the truth of the proposition. Among them, four auxiliaries express 
modality of epistemic judgement:5 

[2-13] Japanese auxiliaries of epistemic judgement 

Auxiliary          Meaning 

hazu               strong logical conviction, equivalent to 
                   English must be, be expected to; 

ni-chigai-nai      subjective sense-based inference, 
                   equivalent to English must, without a doubt; 

daroo              judgement of probability, equivalent to 
                   English probably; 

kamo-shire-nai     judgement based on weak evidence, 
                   equivalent to English may be, might be. 

These auxiliaries in [2-13] are used to express "inferences" in that the 
proposition is based on some kind of warrant. Here, inference includes 
inferences from results and from reasoning (as in [2-6] in this chapter), 
which approximately covers the inferential functions of so-called 
"deduction" and "induction", and perhaps "assumption" and "speculation" 
in the wider scope. 

Each auxiliary word expresses a different degree of 
necessity/non-actuality as well as speaker subjectivity. Johnson 
(1994:90) showed the following figure to indicate the possible 
relationship of necessity/possibility and speaker subjectivity: 
[2-14] (Figure, after Johnson, 1994:90. Dimensions: Possibility, Necessity, 
Hypotheticality, Subjectivity. Auxiliaries, in the order listed: hazu (must), 
ni-chigai-nai (must), daroo (conjecture, probably), kamo-shire-nai (might).) 

In [2-14], we see that the "necessity" of the proposition is 
inversely proportional to "non-actuality" (i.e., "possibility"), 
"hypotheticality", and "subjectivity" ("a speaker's degree of conviction" 
by Johnson)6. This intuitively makes sense. Hazu implies the existence 
of strong evidence in the speaker's mind which allows him to make a 
strong deduction of the necessity of the proposed event; therefore, in a 
sentence with hazu, the degree of hypotheticality, possibility, and 
subjectivity of the proposition is very low; so, at the sentential level, 
hazu should be used when the highest necessity is guaranteed. 

However, it might not be so at the discourse level. For example, it 
can be assumed that the implication of the speaker's strong confidence 
attached to hazu (must) or ni-chigai-nai (no doubt) tends to be avoided 
when a speaker would like to be less assertive. As a result, it is 
presumable that, in discourse, hazu (must) and ni-chigai-nai (no doubt) are 
followed by certain kinds of sentence-ending modalities to decrease the 
level of evidentiality. 

In the same epistemic modality group, four main modal 
auxiliaries are traditionally (in a limited sense) considered to express 
epistemic evidentiality as defined for this dissertation, which are often 
called "hearsay evidentials". A brief explanation of hearsay evidentials 
is as follows: 

[2-15] 

soo          (1) conveys second-hand information obtained directly or 
                 indirectly through any channel, equivalent to English 
                 I heard or I read or I was told; 
             (2) expresses a speaker's conjecture about future or 
                 present events based on the information he obtained 
                 through sensory impression, equivalent to English 
                 it appears; 

yoo/mitai    (1) expresses a speaker's suppositional judgement, 
                 equivalent to English it looks like; 
             (2) expresses counter-factual impressions; 

rashii       expresses a speaker's conjecture based on secondhand 
             information, equivalent to English it seems, it looks 
             like, or I heard. 

The first Japanese auxiliary of hearsay is soo. Soo is usually used 
with a copula as in sooda (plain), soodesu (polite). Soo (da) is used with 
two different meanings: "hearsay soo(da)" and "conjecture soo(da)". 
When preceded by tensed forms, a sentence with hearsay soo(da) 
conveys secondhand information obtained directly or indirectly by the 
speaker through any channel (e.g. hearing, reading) without any 
alteration by the speaker's subjectivity. As in the following example, in 
a hearsay soo (da) sentence, syntactically, the entire predicate before 
soo (da) is usually secondhand information: 

(2-16) 
Shinbun    ni yoru to    Furorida  ni    yuki  ga   futta  sooda. 
Newspaper  according to  Florida   TEMP  snow  NOM  fell   hearsay 

(According to the newspaper, it snowed in Florida.) 
(Makino, 1986:499) 

In sentence (2-16), the part before the auxiliary sooda, i.e., "Furorida ni 
yuki ga futta" (it snowed in Florida) is hearsay information. Of course, 
Japanese has a verb phrase, S to kiita, which literally means I heard S. 
So the meaning conveyed by the next sentence (2-17) does not differ at 
all from sentence (2-16) except that the means of information gathering 
(auditory) is more explicitly stated in (2-17): 

(2-17) 
Furorida  ni    yuki  ga   futta  to    kiita  yo. 
Florida   TEMP  snow  NOM  fell   QUOT  heard  VOC 

(I heard that it snowed in Florida.) 

The other meaning of soo(da) is that of conjecture. Soo(da) can 
be an auxiliary adjective which indicates that what is expressed by the 
preceding sentence is the speaker's conjecture concerning an event in 
the future or the present state of someone or something based on the 
speaker's visual or other sensory impression, or intuition (Makino et al., 
1986:410). "Conjecture soo(da)" occurs after the stem form of adjectives 
and verbs, and means appears to be. Syntactically, adding soo(da) to 
adjectives and verbs converts them into adjectival nouns. Observe the 
following example (2-18): 
(2-18) 
Furorida  ni    yuki  ga   furi       sooda   yo. 
Florida   TEMP  snow  NOM  fall(INF)  appear  VOC 

(It appears/looks like it will snow in Florida.) 

Conjecture soo(da) does not necessarily require the speaker's 
commitment to the proposition, thus "cancellation" of the proposition is 
possible: 

(2-19) 
Furorida  ni    yuki  ga   furi       soo-datta  kedo  fura-nakatta      ne. 
Florida   TEMP  snow  NOM  fall(INF)  appeared   but   fall-(NEG)(PAST)  CONF 

(It appeared/looked like it would snow in Florida, but actually it 
didn't, as we know.) 

Hearsay soo(da) does not involve a speaker's commitment to the 
truth of the proposition. For this reason, there is an opinion that 
hearsay soo(da) should be excluded from epistemic modality since it 
does not involve the speaker's supposition with regard to the necessity of 
the proposition (e.g. Johnson, 1994). I consider this view to be 
appropriate for sentence-level epistemology. But, from the 
pragmatic point of view, hearsay soo(da) certainly functions to present 
mood, since a speaker uses soo(da) when he does not want to commit 
himself to the necessity of the proposition, i.e., he is expressing 
reservation about the proposition or about the people to whom he is 
presenting the proposition. Further, from the evidentiality point of 
view, soo(da) is indispensable in representing the mood of "lack of 
direct evidence". For these reasons, I have included soo(da) in the 
category of Japanese epistemic modality (and it actually turned out to be a 
very frequent mood-indicator in Japanese discourse data). 

The second so-called hearsay auxiliary is yoo, an adjectival noun 
which is also usually used with a sentence-ending copula, da or desu. 
Yoo(da) also has two major meanings: suppositional judgement and 
metaphor. First, "suppositional yoo(da)" expresses a speaker's 
suppositional judgement in cases where the speaker does not have solid 
evidence to argue that his proposition is true, but for some reason, 
supposes it must be very close to the truth (e.g. Teramura, 1984). The 
following sentence is an example of suppositional yoo(da): 
(2-20) 
Doomo,    sore  ga   umaku  ikana-katta   yoo-na       no   ne. 
somewhat  that  NOM  well   go(NEG)-PAST  appear-STEM  VOC  RAPP 

(It somewhat appears that it did not go well.) 

Yoo(da) and mitai(da) function in the same way. They are almost 
interchangeable, but mitai(da) is more colloquial than yoo(da): 

(2-21) 
Are,  yappari         dame     datta      mitai   yo. 
that  as is expected  no-good  COP(Past)  appear  VOC 

(It appears that 'that' did not work as had been expected.) 

Yoo(da) and mitai(da) are also used in counter-factual situations to 
indicate metaphoric observation as in the next example, (2-22). However, 
metaphoric yoo(da) and mitai(da) are used when the speaker knows the 
truth value of his proposition, so they are not indirect evidentials. 

(2-22) 
Sukotto-san  -tte  marude  Nihon-jin  mitai   desu  ne. 
Mr. Scott    QUOT  as if   Japanese   appear  COP   CONF 

(Mr. Scott is just like a Japanese person, although he is not.) 

The fourth hearsay evidential, rashii, indicates the preceding 
predicate to be the speaker's conjecture based on second-hand 
information, such as what he has heard, read, and seen. An English 
equivalent to rashii is it appears, I heard or it looks like. Rashii 
expresses a speaker's conjecture based on some kind of reliable 
evidence. In this sense, rashii functions in a very similar way to 
"suppositional" yoo(da) and mitai(da). 

(2-23) 
Karuforunia  -tte  sugoku  ie     ga   takai      rashii  no   ne. 
California   QUOT  very    house  NOM  expensive  appear  VOC  RAPP 

(It appears that houses are very expensive in California.) 

However, as noted, yoo(da) is often based on sensory 
information (visual information, in particular) while rashii is based on 
information the speaker obtained in any number of ways from the 
environment. Makino et al. (1986) suggested that if there is relatively 
little conjecture in the speaker's mind, rashii is almost the same as the 
hearsay sooda, as is the case with the above sentence (2-23) in which 
the information (i.e. houses are expensive in CA) is widely known. 

We have seen so-called hearsay evidentials: soo(da), yoo(da), 
mitai(da), and rashii. It should be noted that it is wrong to simply call 
this group of auxiliaries hearsay evidentials. Although all evidentials 
are based on information outside the speaker, with each auxiliary, the 
degree of the speaker's supposition involved and the emphasis on the sensory 
fields through which information is obtained are different. Hearsay soo 
(I heard) indicates that the speaker is simply conveying information 
that he obtained "as-is" without his manipulation; so, the speaker is not 
responsible for the truth value of the proposition when he uses soo(da). 
Therefore, a hearsay soo sentence is the least subjective. Rashii (it seems) is 
very similar to hearsay soo(da), but it differs from soo(da) in that it 
involves the speaker's supposition. Yoo(da)/mitai(da) (it looks like) 
also deal with information conveyance with the speaker's supposition. 
Yoo(da) is the auxiliary that a speaker uses in emphasizing the visual 
aspect of the information. The other soo(da) (i.e., "conjecture soo") also 
has an emphasis on visual and other sensory impressions on which the 
speaker bases his conjecture. But it differs from yoo(da) in that the 
speaker does not commit himself to the truth of his conjecture; he 
simply states his conjecture from what he has seen. 

Teramura (1984) attempted to measure degrees of the speaker's 
presupposition involved in the auxiliaries on a scale from zero to 3. He ranked 
"conjecture daroo" (probably) and "conjecture soo" (appears to be) at 3 
(highest involvement), yoo (appears to be) at 2, rashii (seems to) at 1, 
and "hearsay soo" (I heard) at zero (p.260). 
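
Teramura's ranking can be restated as a small table ordered by the speaker's involvement. The following Python sketch is only an illustration: the dictionary keys and the names TERAMURA_SCALE and rank_by_supposition are labels assumed here, not Teramura's notation, and the values are simply those reported above.

```python
# Degree of speaker supposition per auxiliary, as reported from
# Teramura (1984:260). Key labels are assumptions made for illustration.
TERAMURA_SCALE = {
    "daroo (conjecture)": 3,
    "soo (conjecture)": 3,
    "yoo": 2,
    "rashii": 1,
    "soo (hearsay)": 0,
}


def rank_by_supposition(scale):
    """Order auxiliaries from highest to lowest speaker involvement."""
    return sorted(scale, key=scale.get, reverse=True)


if __name__ == "__main__":
    for auxiliary in rank_by_supposition(TERAMURA_SCALE):
        print(TERAMURA_SCALE[auxiliary], auxiliary)
```

Read from the bottom, the ordering matches the description above: hearsay soo conveys information as-is, rashii adds some supposition, and yoo adds more.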

These auxiliaries of evidentiality do not represent the entire 
epistemic modality but are only part of it; there are numerous 
other expressions of modality even at the sentence level across 
grammatical categories such as adverbs, adjectives, particles and 
hedges, and other specific semantic areas regardless of grammatical 
categories. Aoki (1986) paid attention to a specific semantic area, 
Japanese expressions of "sensation", the area in which evidential-like 
expressions are fairly grammaticalized.7 Japanese grammar requires its 
users to make a syntactic distinction between the description of a 
sensation experienced by the speaker and a sensation experienced by 
someone (or something) other than the speaker. When the speaker 
makes an inference regarding the feeling of others, it is necessary to 
add the verbal suffix, -garu, as in (2-27) below: 

(2-25)   Watashi  wa     atui.            (I am hot.) 
         I        TOPIC  hot 

(2-26)  *Kare  wa     atui.               (He is hot.) (*ungrammatical) 
         He    TOPIC  hot 

(2-27)   Kare  wa     atu-gatte-iru.      (He is hot.) 
         He    TOPIC  hot STATIVE 

Since kare (he) is the third person, sentence (2-26) is not 
grammatical. Sentence (2-27) with -gatte (gerundive form of garu) + 
iru (stative, non-past) is grammatical. Aoki explains that -garu has the 
function of expressing inference (based on indirect evidence) rather 
than direct experience.8 He supported his point by arguing that 
Japanese mimetic words expressing pain (usually adverbs) such as 
chikuchiku (pricking), gangan (pounding), shikushiku (throbbing), 
and zukizuki (throbbing surface wounds) cannot be used with -garu. 
Pain may be perceived as something a person feels directly, so mimetic 
adverbs cannot be used with a third person subject as demonstrated in 
the following ungrammatical sentence (2-28). 

(2-28)  *Kare  wa     zukizuki     ita-gatteiru. 
         He    TOPIC  throbbingly  pain 

         (He has a throbbing pain.) 

From the perspective of evidentiality, it is reasonable to assume 
that these expressions of sensation have been generally accepted as part 
of grammar due to the inherent difficulty of "knowing" other people's 
sensory feelings. A proposition such as he is hot is hardly attainable 
except in the case of literary texts in which a speaker (i.e., narrator) is 
supposed to be omniscient and knows all the characters' inner thoughts 
(cf. Banfield, 1982). 

Aoki also pointed out the function of the Japanese noun no (or 
"n", which is often called "nominalizer-no") as an evidential marker of 
fact. He noted that no may be used "to state that the speaker is 
convinced that for some reason something that is ordinarily not directly 
knowable is nevertheless true" (p. 228). For example, as shown earlier, 
the following sentence (2-29) is ungrammatical. But if the speaker adds 
no to the end, as in (2-30), the sentence will imply that the speaker has 
some evidence to assert that he is hot is a "fact". Perhaps, the speaker 
might have witnessed the referent kare sweating or have heard kare 
complain about heat (evidence). 

(2-29)  *Kare  wa     atui.           (He is hot.) 
         He    TOPIC  hot 

(2-30)   Kare  wa     atui  no   da.  (I know that he is hot.) 
         He    TOPIC  hot   NML  COP 

Aoki comments that semantically no "removes the (preceding) 
statement from the realm of a particular experience and makes it into a 
timeless object. The concept becomes nonspecific and detached." (p. 
229) I think what Aoki meant is that the propositional part of (2-30) 
kare wa atui (he is hot) is presented as a fact in the speaker's 
interpretation by using the no-da sentence ending. Therefore, for Aoki, 
no can present the speaker's subjective judgement based on some kind 
of strong evidence. Actually, the function of no-da (or no-desu) seems 
varied even within a limited scope of evidentiality markings without 
being limited to Aoki's analysis (see chapters four and five for more 
discussion on this topic). 

So far, examples of evidentials from auxiliaries and other areas 
have shown fairly "explicit" evidentiality in terms of lexical meanings. 
There are also discussions of the "implicit" phenomena of modality in 
Japanese. Akatsuka (1978, 1985) found that Japanese subjective 
judgment lies in subtle ways of using words such as the selection of 
conditionals and complementizers. Akatsuka paid attention to the aspect 
of epistemology of the speaker influencing sentence structure and 
claimed that Japanese conditionals can be arranged on a scale of irrealis 
(hypothetical non-actual world).9 Iwasaki (1993) proposed the concept 
of "information accessibility" between a speaker and his proposition. 
Iwasaki claims that the speaker's awareness of how accessible his 
proposition is to him determines the speaker's choice of linguistic 
modality (i.e., tense, in particular). He found that a speaker tends to use 
more present tense in talking about a third person's past events than in 
talking about his own past events (cf. historical present tense by 
Wolfson, 1982). According to Iwasaki, a possible deduction is that a 
speaker usually has good knowledge about his own experience in the 
past which forces him to use past tense according to prescriptive rules: 
one's own information is more accessible than others'. These studies of 
"speaker subjectivity" (e.g. Iwasaki, Akatsuka) or "speaker 
epistemology" (e.g. Akatsuka) are related to the issues of evidentiality in 
that even at the sentence level, the grammatical structure of an 
utterance is partly a product of subjective judgement of the speaker. 

So far, we have briefly reviewed the studies of Japanese 
evidentials within the scope of sentence level modality. The existing 
theoretical scope of sentence modality is very limited in that it does not 
involve analysis of speech events, in particular, the existence of a 
hearer. As the theory of discourse modality (e.g., Maynard, 1992) 
suggests, it is necessary to broaden the focus of evidentiality 
phenomena when we deal with natural use of language. 

FROM SENTENTIAL MODALITY TO DISCOURSE MODALITY 

Language users do not need to incorporate auxiliaries and other 
evidential expressions in "factual" statements which are 
unchallengeably true to everyone (cf. Givon, [2-10]). When this is not 
the case, a speaker often wants to show that he does not guarantee the 
truth value of his proposition one hundred percent by adding some kind 
of marker of epistemic modality. Therefore, theoretically, in an extreme 
case, if a speaker only talks about "facts" (not only in his understanding 
but widely known to be so), he does not need any kind of markers of 
epistemic modality. But is this possible? At the level of sentence 
grammar, it may be so; but at the discourse level, it is not. Even if a 
statement is known to be a simple fact, we certainly have occasions in 
which we feel some additional marker of epistemic modality will do good. 
One plus one is two is a logical fact known to most of us. However, in 
some kind of speech situations, we certainly say this phrase with some 
marker of modality added, for example, Isn't one plus one two? 
(although the statement One plus one might be two is rarely used). 
Imagine the case in which you know that your hearer is forty years old, 
and the hearer knows you know that. Then, if it is necessary to remind 
the hearer that he is a grown-up, perhaps the statement You are forty 
years old can be said, but you might add some epistemic "flavor" to it 
depending on when, where, and whom you are talking to. Aren't you 
forty years old?, You must be forty years old by now, or I thought you 
were forty years old often sounds better than the declarative You are 
forty. Even though it is not true at all to say that generative and 
discourse grammar are mutually exclusive (Chomsky, 1980), a discrete 
concept of discourse grammar must be necessary in order to deal with 
the issues of pragmatic language use (e.g. Teratsu 1983, Inoue 1983). As 
Ricoeur (1981) said, discourse has a particular speaker or writer, a 
particular hearer or reader, and is made at a particular time, in a 
particular world. These traits of discourse naturally make its acceptable 
features distinctively different from the ones in Saussure's concept of 
"langue". 

The discourse meaning of epistemic modals differs from their 
meaning at the sentence level. The hearsay marker rashii (it seems) is 
said to be used to indicate that the speaker has obtained the proposition 
from outside and made an inference based on the information, but in 
actual conversational discourse, there are instances in which the 
speaker uses rashii in describing a proposition which he has directly 
obtained and is thus confident of its truth value. Observe the following 
sentences. 
(2-31) 
Kare wa kondo kachoo ni naru -rashii yo.
He SUBJ this time section-head DAT become seem VOC

Kinoo buchoo ni soo iwarete-ita no o kiita -nda. 
Yesterday dept-head DAT so told (PASS) STAT NML OBJ heard N COP 

(It seems like he is going to be the section head this time. I 
heard he was told so by the department-head yesterday, I tell 
you.) 

In sentence (2-31), the speaker directly obtained the information 
(overheard-auditory direct experience), but his usage of -rashii (seem) 
is quite acceptable. An indirect statement is indeed better than a direct 
statement since the propositions in (2-31) are about a third person's 
matter. In (2-31), the speaker obtained the information by overhearing 
rather than through a public announcement or from the referents 
(buchoo or kare). These factors prevent the speaker from using a direct 
expression although he knows the information is true. A direct 
statement would sound as though the speaker is meddling with other 
people's affairs. 

In this way, the actual usage of evidential markers is not always 
what would be expected from the rules of sentence modality: the matter 
of necessity/possibility of the proposition. The variation of markings is 
largely ruled by pragmatic discourse modality. Recently, Johnson took 
the position that sentence-level modality is "a subcategory of a larger 
picture of modality that is defined as a speaker's psychological attitude" 
(1994:46), meaning that sentence modality is only a part of the 
phenomenon of linguistic modality as a whole. As Maynard (1993) 
proposed, modality should not be limited to the sentence level but 
expanded to the discourse level. At the discourse level, a speaker usually 
has one or more hearers; therefore, knowledge about the hearer(s) will 
have some influence on the speaker's use of evidentials (cf. Givon, 1982 
for the Revisionist view). Further, the speaker needs to be concerned 
with the pragmatic consequence of his statement as it affects his goal of 
communication, his social image, and the relationship between himself 
and his hearer(s). Maynard (1993) suggested that the "modality of social 
interaction" cannot be wholly accommodated within the limited 
framework of previous studies of modality in that "discourse modality is 
a broader notion which includes not only the speaker's attitudes 
expressed by independent lexical items or combinations thereof but also 
those that can be understood only through discourse structures and in 
reference to other pragmatic means" (p. 39). Discourse modality, as 
referred to by Maynard, is, in short, a matter of language pragmatics in 
conversational or speech discourse since Maynard focused on how some 
selected discrete lexical items--for example, discourse connectives 
dakara (therefore) and dakedo (however), sentence-ending da (plain) 
and desu/masu (formal), interactional particle yo and ne--function in 
discourse. It is true that theories of sentential modality were often based 
on conveniently created sentences, or if they were authentic, such data 
was often from a limited range of speech events. The concept of 
discourse modality is a broader view of modality. Certainly, various 
aspects of discourse pragmatics can be viewed from the perspective of 
modality, and this dissertation should also be considered a study of 
discourse modality. 

CHAPTER 2: NOTES 

1 Obviously Chafe did not desire to commit himself to an overly 
restricted view of evidentiality. As Willet observed, evidentiality 
marking is so often interwoven with other areas of grammar, 
particularly tense and aspect (also cf. Chung and Timberlake, 1985), that 
to "extract" the "pure" aspect of evidentiality is often difficult. One 
example from Takelma is quoted below from Chung and Timberlake. In 
Takelma, the future tense differs from other modes in that future tense 
cannot be negated simply by adding a negative adverb; negative future 
events are expressed by the inferential mood (i.e., evidentiality) plus 
the negative adverb as in (2-32) (b) below. In Takelma, both "future" 
and "inferential" use the irrealis stem. 

(2-32) 
(a) Yana-?tt 
go(IRR)-3SG(FUTURE) (He will go.) 

(b) Wede yana-kt 
not go(IRR)-INFERENTIAL (He will not go/Evidently he didn't go.) 

I have observed that various aspects in modality are interwoven 
in English too. For example, in the sentence I could have done so and 
so, the modal auxiliary can is combined with "past tense" and 
"perfective aspect" resulting in signifying the mode of irrealis, i.e., a 
non-actual world with no possibility. 

2 Often "source" means information source such as the speaker's 
direct sensory experience or somebody else's direct experience. But, 
Chung and Timberlake used the word to mean some entity whose point of 
view characterizes the event as either actual or non-actual: 

For primary events, the source is typically the speaker; it is the 
speaker who identifies the event as actual, or imposes it on the 
addressee, or denies responsibility for its truth, and so on. For 
secondary events the source is typically the subject of the matrix 
clause. For example, with governing verbs of intention ('want', 
'try') or obligation ('order', 'forbid') the subject of the verb 
provides the source of modality for the subordinate clause. (232-233) 

Thus, for Chung and Timberlake, the source is the speaker's 
subjective certainty, unless someone else's viewpoint is being conveyed, 
in which case the syntactic subject of the sentence is the source. 

3 Jo (助) means help and dooshi (動詞) means verb. Historically, 
there have been arguments on whether Japanese jo-dooshi are a 
part-of-speech or not. Ootsuka (1904) first introduced the concept of 
jo-dooshi into school grammar as a part of speech. Later, some linguists 
(e.g. Matsushita, 1930, Suzuki, 1978) argued that jo-dooshi are not a 
part-of-speech in that jo-dooshi simply help a verb to be conjugated and 
constitute a predicate. Hashimoto (1948) and Tokieda (1950) took the 
position that jo-dooshi are a part-of-speech. Hashimoto proposed the 
concept of bunsetsu (phrase) in which jo-dooshi together with an 
independent lexical item (e.g. verbs, adjectives) constitutes a phrase 
which is treated independently as a new lexical item. 

4 Japanese verbs, adjectives, adjectival-nouns and copula are 
conjugated to mark for tense (non-past/past) and affirmative/negative 
alternations for several functional forms such as command, potential, 
imperative, conditional, volitional, passive, causative, and causative-
passive. Each conjugated form has both plain and polite (formal) forms. 
Inflected parts (other than the "core" part) are often jo-dooshi 
(auxiliary) or setsubi-go (suffix). 

5 The description of the behavior of Japanese modal auxiliaries 
offered here is very limited. For more information, see Alfonzo (1966), 
Teramura (1984), Makino and Tsutsui (1986), Johnson (1994) and others. 

6 According to Johnson's definition, the term "subjectivity" 
indicates the degree of speaker's confidence in asserting that the 
proposition is true. When evidence is strong, the speaker can have a 
high degree of confidence (low or little subjectivity); and when a 
speaker lacks confidence in judging a situation, the judgment becomes 
highly subjective. 

7 Most Japanese evidentiality expressions are not 
grammaticalized; however, some evidential-like aspects seem to be 
grammaticalized although their status is not clear (e.g. Watanabe, 1984). 
These days, expressions of a third party's sensations are being treated as 
grammar rules in many textbooks for Japanese-as-a-foreign-language 
classes. 

8 On this point, I disagree with Aoki. I consider that -garu 
expressions are based on a speaker's strong belief or inference which is 
based on his "direct" sensory information such as being directly told 
about the third person's feeling. Watanabe (1984) also discussed the 
verbal and adverbial suffix -garu in expressing sensations of a non-
speaker. Watanabe viewed the phenomenon from the perspective of 
transitivity. He argued that, in Japanese, the construction of NOM-ACC 
has higher transitivity than that of NOM-NOM, and if a statement is 
based on direct evidence, the NOM-ACC of higher transitivity is 
required: 

(2-33) Masao ga kaminari o kowa-gatte-iru. 
Masao NOM thunder ACC fear-DIR-STAT 
(Masao is showing fear of thunder.) 

(2-34) *Masao ga kaminari ga kowa-gatte-iru. 
Masao NOM thunder NOM fear-DIR-STAT 
(Masao is showing fear of thunder.) 

(2-35) Masao ga kaminari ga kowai rashii. 
Masao NOM thunder NOM fear seem 
(It seems that Masao is afraid of thunder.) 

Since word order is fairly flexible in Japanese, particles (e.g. ga 
and o above) are used to assign cases. Watanabe considered -garu to be 
an auxiliary of direct evidence (cf. Aoki considered that -garu expresses 
a speaker's inference based on indirect evidence) which is only used for 
a high transitivity construction such as in (2-33); accordingly, (2-34), 
which combines a low transitivity construction with -garu, 
results in an ungrammatical utterance. A low-transitivity sentence 
construction is only used for an indirect statement such as (2-35). I 
think that Watanabe's theory of the relationship between kinds of 
evidence and sentence transitivity is insightful. Watanabe 
characterized -garu as a direct evidence marker, in direct opposition 
to the traditional analysis of -garu as an indirect evidence marker. Being in 
agreement with Watanabe, I consider that the so-called indirect suffix 
-garu is an evidential of "fact": a speaker can state other people's 
sensation subjectively as a fact (necessarily with some strong evidence, 
e.g. directly hearing from the target person). This view is deduced 
from many speakers' use of -garu with other indirect evidential 
markers (e.g. rashii, mitaida, yooda, all meaning seems), suggesting that 
sentence-ending -garu is rather assertive. This fact implies that, at least 
pragmatically, -garu is understood as a "fact" marker. Usually, in 
conversation, unless the third person clearly states his feeling to the 
speaker, sentence (2-27) is said with an indirect marker such as -rashii 
as in (2-36) below: 

(2-27) Kare wa atsugatte-iru. (He is hot.) 
he TOP hot STATIVE 

(2-36) Kare wa atsugatte iru rashii (It seems he is hot.) 
he TOP hot STAT seem 

I have observed that direct statements (such as 2-27) inferring other 
people's sensation based on the speaker's simple observation (e.g. 
finding that someone is sweating) are not often used in conversation. 
Usually, an evidential of a high degree of possibility is added (e.g. rashii, 
yooda, mitaida) in order to mitigate the potential offensiveness of the act 
of talking about someone else's feelings. Therefore, I consider that 
-garu is not a complete evidential at the discourse level. Certainly -garu 
indicates the sensation of someone other than the speaker, which 
suggests there is distance between the speaker and the information. But 
if -garu is used as a direct sentence ending, the overall sentence 
modality is direct, implying the speaker's confidence in the proposition. 
This case shows that sentence-final modality may override inner 
sentence modality expressions at least in some cases. 

-Garu allows a speaker to subjectively state other people's 
internal state of mind. This is one example of the subjective aspect of 
the Japanese language. 

9 Each of the four main conditionals in Japanese (nara, tara, ba, 
and to) requires a different semantic environment for its grammatical 
use, but the four share the same meaning, which is equivalent to 
English if, when or whenever depending on the meaning of the 
consequent clause. Conditional expressions do not require the speaker's 
commitment to the proposition because they simply present possible 
worlds in the conditional clause. Conditionals do not represent 
evidential meanings; thus, they are beyond the scope of this study, but 
they are certainly an important part of Japanese modality. 


CHAPTER 3: DISCOURSE MODALITY IN JAPANESE 
In the last chapter, I demonstrated that modals--evidentials in 
particular--need to be investigated on the discourse level in order to 
understand their pragmatic use. On the discourse level, it is speculated 
that the existence of a hearer has a significant influence on the system 
of Japanese evidentiality. In this chapter, the issue of 
"hearer-sensitivity" of Japanese discourse, which appears in the form of 
modality, will be further discussed. 
Discussing Japanese communication style, Clancy (1986) claims 
that, in Japanese culture, the main responsibility of communication lies 
with the listener: the listener must know what the speaker really means 
regardless of what the speaker literally says, however ambiguous, 
indirect, and reticent he may be. In contrast, she argues, in American-
style communication, "the main responsibility for successful 
communication rests with speakers who must know how to get their 
ideas across" (p. 217): the speaker expresses his wishes, needs, thoughts, 
feelings in adequately explicit ways in words rather than indirectly or 
nonverbally. This claim seems to present an overly simple dichotomy 
on both sides. Clancy emphasizes that the Japanese style of 
communication depends on interpersonal "empathy" of a homogeneous 
society in which people anticipate each other's needs, wants and 
reactions without explicit verbal interaction. Clancy's contention 
makes it sound as if Japanese are "telepathic", which is, of course, not 
necessarily the case. However, Clancy is likely correct in her claim that 
Japanese mother-child interaction focuses on the development of an 
empathetic speech-style in child cultural cognition. She suggests that 
Japanese empathetic communication style is a case of the language-
culture relativism view advocated by scholars such as Whorf (1956) and 
Scollon and Scollon (1981). An important remark made by Clancy, 
which is relevant to this dissertation, is that Japanese communication is 
listener-oriented. 

LISTENER-ORIENTED MODALITY AND SENTENCE-ENDING FORMS 

Clancy said that in Japan "Communication can take place without, 
or even in spite of, actual verbalization. The main responsibility lies 
with the listener who must know what the speaker means, regardless of 
the words that are used." (p. 217) As suggested here, the listener may 
have the "responsibility" to correctly determine the meaning that the 
speaker intended to express. In this sense, Japanese communication 
style is listener-oriented because the speaker relies on the listener to 
understand his meaning which may be expressed in ambiguous ways. 

Clancy, perhaps, only paid attention to intentional "contextual" 
ambiguity in Japanese speech. Another important factor of listener-
orientation in Japanese communication, which Clancy did not mention, 
and one that I believe is ultimately more important, is the speaker's 
careful observation of the listener's knowledge level. How does a 
speaker indicate his observation of the listener's knowledge? 
Sentence-ending modality markers function to do this. It has been pointed out 
that the sentence-ending modality provides the strongest marker of 
mood in a Japanese sentence. Theoretically, a sentence can have several 
modals, but the mood of the last modal is usually accepted as the sentence 
modal. For example, as explained in the last chapter, the modal of 
hearsay evidentiality, yooda (appear), is the dominant modal in the 
following sentence as "report based on observation": 

(3-1) Kare wa atsugatte iru yooda (It seems he is hot.) 
he TOP hot STAT seem 

In (3-1), kare wa atsugatte-iru (he is hot) grammatically presents the 
mode of realis but sentence-ending evidential yooda turns the mood of 
the whole sentence into irrealis. Masuoka (1989) explains the 
general idea of mood construction in a sentence as follows: 

[3-2] Bun (sentence) 

Meidai         Mitomekata             Tensu          Shingi-handan 
               no modality            no modality 
(Modality      (Modality of           (Modality of   (Modality of 
of subject)    acknowledgement:       tense)         truth: 
               affirmative/negative)                 necessity and 
                                                     possibility) 

Masuoka points out that there exists a hierarchical relationship 
between modalities within a sentence. As the following diagram [3-3] 
indicates, the last modality of shingi-handan (necessity and possibility) 
holds the responsibility of deciding the final mode of the sentence. In 
this sense, this view is similar to Tokieda's (1941, 1950), which 
was introduced in chapter two, [2-12]. In the sentence, ame ga fura 
nakatta rashii (it seems it did not rain), rashii (it seems) presents the 
mode of the sentence as a whole: 

[3-3] Bun (sentence) 

Ame ga fura      nak-                   -katta         rashii 
rain NOM fall    NEG                    PAST           it seems 
(Subject)        (Modality of           (Modality of   (Modality of 
                 acknowledgement:       tense)         truth: 
                 affirmative/negative)                 necessity and 
                                                       possibility) 

(It seems that it did not rain.) 
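
The hierarchy in [3-2] and [3-3] can be illustrated with a minimal sketch. It assumes that each segment of a sentence is tagged with one of four layers, and that the mode of the whole sentence is read off the outermost layer that is present. The layer labels, the Segment class, and the function name are hypothetical conveniences for illustration and are not part of Masuoka's or Tokieda's formulation.

```python
from dataclasses import dataclass

# Illustrative labels for the four columns of diagram [3-2],
# ordered from the innermost layer to the outermost one.
LAYERS = ("proposition", "acknowledgement", "tense", "truth")


@dataclass(frozen=True)
class Segment:
    form: str   # surface form, e.g. "rashii"
    gloss: str  # gloss as given in [3-3]
    layer: str  # one of LAYERS


def final_modality(segments):
    """Return the segment on the outermost modality layer that is present."""
    return max(segments, key=lambda seg: LAYERS.index(seg.layer))


# The sentence analysed in [3-3]: ame ga fura-nak-katta rashii
sentence = [
    Segment("ame ga fura", "rain NOM fall", "proposition"),
    Segment("nak", "NEG", "acknowledgement"),
    Segment("katta", "PAST", "tense"),
    Segment("rashii", "it seems", "truth"),
]

print(final_modality(sentence).form)      # rashii
print(final_modality(sentence[:3]).form)  # katta (no truth modal present)
```

With rashii present, the sketch returns rashii as the element that sets the mode of the sentence; without it, the tense layer is the outermost one left.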

Until very recently, the modality of the sentence ending had not 
received sufficient attention, while only the explicit lexical meanings of 
modal words were investigated independently on a word-by-word basis. 
The function of the sentence-final particles such as ne, yo, no, wa, sa, 
ze, and zo is one of the popular issues of discourse pragmatics (e.g. Tokieda, 
1951; Saji, 1956; Kitagawa, 1984). The study of the sentence-ending 
particles is a genuine discourse issue because sentence-level grammar 
does not require them, and accordingly, functions of sentence-final 
particles were not emphasized in Japanese-as-a-foreign-language 
classrooms until recently. According to Maynard's historical review 
(1992), traditionally, sentence-ending particles (shuu-joshi in 
Japanese), which only appear in speech with distinctive addressees, 
have been considered to be somewhat "interactional" since at least 
Tokieda (1951). Tokieda claimed that the particle ne is used to request the 
hearer's sympathy and zo and yo function to force the speaker's view 
onto the hearer. Uyeno (1971) classified these particles into two 
categories: (1) those which express the speaker's insistence on forcing 
the proposition on the hearer (yo, wa, zo, ze, sa); and (2) those which 
express a request for compliance with the proposition but leave the 
option of confirmation to the hearer (ne, nee, na, naa). Kitagawa (1984) 
and Watanabe (1968) considered ne to indicate that the proposition of 
the sentence is related to the addressee. McGloin (1990) distinguished 
three types of functions of sentence-final particles: (1) zo, ze, sa, and yo 
function to "impart information which belongs to the speaker's sphere 
to an addressee"; (2) ne and na are "used to seek confirmation from the 
hearer"; and (3) ne, na, wa, and no function to create "rapport" (p. 36). 

All these researchers share an almost identical perspective on these 
particles. 

In summarizing the existing views and focusing on 
interpersonal aspects of the particles, Maynard (1992) called 
sentence-final yo and ne "interactional particles" and indicated that they were 
also "discourse modality indicators" that focus on different aspects of 
discourse modality: yo focuses on the informational aspect of the 
proposition, and ne focuses on the interpersonal aspect in soliciting 
confirmation and emotional support. Since these sentence-final 
particles may involve the speaker's judgment of his hearer's 
knowledge (ne, na) and/or judgment of the necessity and possibility of 
the proposition (yo, sa, etc.), it is predictable that these sentence-final 
particles play important roles in Japanese evidentiality (cf. chapters 
four and five for details). In particular, the pragmatic function of the 
particle ne has been drawing attention since Kamio (1979) as an 
important modal in discourse. 

SENTENCE-ENDING FORMS AND THE SPEAKER'S TERRITORY OF 
INFORMATION 

When there was no general concept of the sentence-ending 
mood, the Japanese psychologist Akio Kamio (1979, 1985, 1987, 1990, 
1994) proposed an insightful theory that a speaker, using sentence-
final forms, linguistically marks the information territory to which his 
proposition belongs. Kamio applied the theory to English and Japanese 
and discussed the differences between the two languages in the 
speaker's concept of information territory. Regardless of the 
practicality of his view in modeling reality, Kamio's model offered a new 
perspective to the field of discourse pragmatics. As has been 
recognized, other than sentence-level grammar, there is a wide range 
of uses of language that a person may need to have knowledge of and 
skill in performing to be considered a competent speaker of that 
language (e.g. Hymes, 1979; Halliday, 1979). Through teaching Japanese, 
I have often felt that the appropriate usage of the sentence-final 
modality markings is one of the biggest issues for learners in becoming 
competent speakers of the language. Although this aspect of Japanese is 
not part of the language's grammar, it is an important pragmatic 
requirement of discourse (i.e., discourse grammar) which even native 
speaker language teachers would have a difficult time describing 
systematically. 

In Japanese, researchers have put some thought into the concept 
of discourse grammar. For example, Kuno (1978) attempted to formulate 
the rules of ellipsis and syntactical phenomena in discourse, and Inoue 
(1983) discussed Japanese particles wa and ga as markers of new/old 
information in a given discourse. The theory of territory of 
information by Kamio was, however, the first to discuss sentence-
ending modalities as pragmatic rules of spoken discourse. 

Kamio's framework can be interpreted in such a way that most 
Japanese sentences in discourse must have the "right" kind of modality 
in the sentence ending if they are concerned with the hearer. The 
theory's major concern is the relation of the sentence-ending forms 
and the speaker's psychological concept of territory. Since epistemic 
markers usually reside in sentence modality (e.g. Palmer 1986; Willet 
1988), and sentence modality is often found in the sentence-ending in 
Japanese (e.g. Nitta & Masuoka, 1989), I consider Kamio's information 
territory theory to be also a theory of epistemic evidentiality. 

Kamio paid attention to the sentence-ending forms at the 
discourse level instead of at the sentence level. For example, the 
following Japanese sentences in direct forms are perfectly grammatical 
at the sentence level, but may sound inappropriate at the discourse 
level. The following sentences are all in direct ending forms: 

(3-4) O.J. wa muzai ni nat-ta. 
O.J. TOP innocent to become -PAST. 
(O.J. was found innocent by the jury.) 

(3-5) Watashi wa anata ga suki desu. 
I TOP you ACC like COP(FOR) (I love you.) 

(3-6) Kyoo wa ii tenki desu. 
Today TOP nice weather COP(FOR) (It's a fine day.) 

The Japanese sentences above, which we teach in 
Japanese-as-a-foreign-language class, are grammatical as they are. However, when 
used in actual communication, each of them sounds fairly "declarative" 
as they disregard the hearer's existing knowledge about the proposition. 
Or these sentences may sound "careless" about the hearer's possible 
disagreement with the proposition. Therefore, these utterances are 
often considered to be too assertive at the discourse level in many 
speech situations. Sometimes an assertive declarative sentence works 
well to serve the speaker's purpose; for example, sentence (3-5) is often 
used by a speaker who wants to confess his "one-sided" love to his target 
who is not aware of the speaker's secret feeling. 

The following utterances (3-4'), (3-5'), and (3-6') encode 
consciousness of, or attempt to involve, the hearer's knowledge about 
the proposition by attaching modality markers at the end of the 
sentence: 

(3-4') O.J. wa muzai ni natta sooda ne. 
O.J. TOP innocent became heard PART(RAPP) 
(I heard that O.J. was found innocent by the jury.) 

(3-5') Watashi wa anata ga sukina n-desu. 
I TOP you ACC like-n COP(FOR) 
(I love you, please understand/as you might know.) 

(3-6') Kyoo wa ii tenki desu ne. 
Today TOP nice weather COP(FOR) PART(SHAR) 
(It's a fine day, as we both know.) 

In sentence (3-4'), auxiliary sooda indicates that the information 
is second-hand. Sooda (I heard) shows the speaker's consciousness of 
distance between himself and his proposition. English translations for 
(3-5') and (3-6') are almost the same as the corresponding ones for (3-5) 
and (3-6), while the pragmatic Japanese meanings are different. The 
nominalizer -n (or no) in (3-5') is said to mark the speaker's intention to 
explain, to persuade, to convince, or to give background information or 
new information as if it is already known to the hearer (e.g. McGloin, 
1980: 144), and ne in (3-6') as well as (3-4') is said to indicate the 
speaker's awareness that the information is shared with the hearer (e.g. 
McGloin, 1990, Maynard, 1993, Kamio, 1979-1994, Takubo & Kinsui, 
1990-1992). Therefore, in sentences (3-4') to (3-6'), modification to reduce 
assertiveness is made through the sentence-ending forms. 

In relation with sentence-final modalities, Kamio proposed that 
there are two fundamental conceptual information territories: the 
speaker's and the hearer's territories of information. His early theory 
had only four types of information: the factors of "inside/outside of the 
speaker's territory" and "inside/outside of the hearer's territory" make 
a two-by-two matrix resulting in four different types. Each 
information category was assigned a single surface 
sentence-ending form: 

[3-7] Kamio's original concept of four information territories for a 
speaker 

                        Inside the hearer's       Outside the hearer's 
                        territory                 territory 

Inside the speaker's    TERRITORY A               TERRITORY B 
territory               (information belongs      (information belongs 
                        to both speaker's and     only to the speaker's 
                        hearer's territories)     territory) 
                        direct+ne form            direct form 

Outside the             TERRITORY C               TERRITORY D 
speaker's territory     (information belongs      (information is out of 
                        only to the hearer's      both speaker and 
                        territory)                hearer's territories) 
                        indirect+ne form          indirect form 
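
The two-by-two matrix in [3-7] can be restated as a small lookup, shown here only as a sketch. It assumes that the two factors can be treated as simple yes/no values, which is how the early model is described above; the function name and the dictionary are hypothetical and serve only to make the table explicit.

```python
from typing import Tuple

# Territory labels as given in table [3-7], keyed by
# (inside speaker's territory, inside hearer's territory).
TERRITORY_LABELS = {
    (True, True): "A",
    (True, False): "B",
    (False, True): "C",
    (False, False): "D",
}


def early_ending_form(in_speaker: bool, in_hearer: bool) -> Tuple[str, str]:
    """Return (territory label, sentence-ending form) following table [3-7]."""
    territory = TERRITORY_LABELS[(in_speaker, in_hearer)]
    # The speaker factor selects direct vs. indirect ...
    base = "direct" if in_speaker else "indirect"
    # ... and the hearer factor adds the particle ne.
    form = base + "+ne" if in_hearer else base
    return territory, form


for key in TERRITORY_LABELS:
    print(key, early_ending_form(*key))
```

Written this way, the table shows that the direct/indirect choice depends only on the speaker's territory and the presence of ne depends only on the hearer's territory.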

This earlier framework of Kamio is relevant to a well-known 
psychological concept, the "Johari Window", developed by psychologists 
Joe Luft and Harry Ingham (e.g. Goffman, 1968). The Johari Window is 
"a flat-pack, conceptual model for describing, evaluating, and 
predicting aspects of interpersonal communication" (Jarvis, 1996). This 
idea describes four different ways of how you are seen by others and 
how you see yourself, which demonstrates patterns of how people 
communicate with the outside world. This psychological view of human 
communication style assumes four different windows of the human 
mind which are classified by two sets of contrastive factors: "self" vs. 
"others", and "known" vs. "unknown": 

[3-8] Johari Window 

WINDOW PANE   SELF      OTHERS    DESCRIPTION of KNOWLEDGE 
#1            known     known     public 
#2            known     unknown   hidden from others 
#3            unknown   known     blind to self 
#4            unknown   unknown   unconscious 

                       OTHERS 
                   known   unknown 
SELF   known        #1       #2 
       unknown      #3       #4 

This concept suggests that an individual views himself as well as 
others through one of these panes in each social interaction. Although 
the concept deals with the self-image of an individual, the foundation of 
the Johari Window is equivalent to Kamio's concept in that information 
(for Kamio) or self-image (for the Johari concept) is viewed in 
relation to how it is known or perceived by oneself and by other people; 
thus the concept of information territory is also a psychological issue. 

Later Kamio (1994) revised the theory and argued that 
information has a relative and gradable character, so sometimes it falls 
completely in the territories of both sides and sometimes it falls more in 
one side than in the other. Based on this idea, Kamio assumed six 
different "cases" of interaction of the speaker's and hearer's 
information territories, in which most of our daily utterances fall. 
Kamio said that the sentence-ending form of each utterance reflects the 
types of interaction of the information territories to which utterances 
belong as shown in [3-9]: 
[3-9] Cases of interaction of the speaker's and hearer's information 
territories / Sentence-ending forms in Japanese discourse 

(A) The speaker's territory only 
    (e.g. I have a headache.) ---- direct form 
(B) Both Speaker's and Hearer's territories 
    (and information is completely shared) 
    (e.g. It's a beautiful day.) ---- direct+ne form 
(BC) Both Speaker's and Hearer's territories 
    (but the speaker considers the information to fall more within his own 
    territory than in the hearer's territory.) 
    (e.g. My sister is pretty, isn't she?) ---- daroo(deshoo) form 
(CB) Both Speaker's and Hearer's territories 
    (but the speaker considers the information to fall more deeply within the 
    hearer's territory than in the speaker's territory.) 
    (e.g. You are Mr. Yamada, aren't you?) ---- daroo(deshoo) form, janai form 
(C) The hearer's territory only 
    (e.g. It looks like you are feeling sick, aren't you?) ---- indirect+ne form 
(D) Neither the speaker's nor the hearer's territory 
    (e.g. It seems that it will be fine tomorrow.) ---- indirect form 

Japanese sentences which correspond to the above English 
sentences are shown below: 
[3-10] 

(A) Watashi, atama ga itai. 
I head NOM aches. (direct) (I have a headache.) 

(B) Ii tenki desu ne. 
fine weather COP(FOR) PART(CONF) 
(It's nice weather, as we both know.) 

(BC) Uchi no imooto, kirei daro. 
my POSS younger sister pretty AUX(tag-question) 
(My sister is pretty, isn't she?) 

(CB) Yamada-san deshoo? 
Mr. Yamada AUX(confirmation) 
(You are Mr. Yamada, am I right?) 

(C) Kibun ga warui mitai desu ne. 
feeling NOM bad appear COP(FOR) PART(CONF) 
(You seem to be feeling sick, aren't you?) 

(D) Ashita wa hareru daroo. 
tomorrow CNT get fair AUX (conjecture) 
(It will probably be fine tomorrow.) 

(The sentences are selected from Kamio, 1994: 87-98 and presented with 
minor modifications) 
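
The correspondence in [3-9] can also be written out as a lookup table, given here as a minimal sketch. It records only the pairings listed in [3-9]; the dictionary layout, the short territory descriptions, and the helper names are hypothetical, and the sketch makes no attempt to model how a speaker decides which case applies.

```python
# Case labels and sentence-ending forms as listed in [3-9].
CASES = {
    "A": {"territory": "speaker only", "forms": ("direct",)},
    "B": {"territory": "both, completely shared", "forms": ("direct+ne",)},
    "BC": {"territory": "both, more the speaker's", "forms": ("daroo (deshoo)",)},
    "CB": {"territory": "both, more the hearer's", "forms": ("daroo (deshoo)", "janai")},
    "C": {"territory": "hearer only", "forms": ("indirect+ne",)},
    "D": {"territory": "neither", "forms": ("indirect",)},
}


def forms_for(case):
    """Return the sentence-ending forms that [3-9] lists for a case."""
    return CASES[case]["forms"]


def cases_with(form):
    """Return every case for which [3-9] lists the given form."""
    return [case for case, entry in CASES.items() if form in entry["forms"]]


print(forms_for("CB"))               # ('daroo (deshoo)', 'janai')
print(cases_with("daroo (deshoo)"))  # ['BC', 'CB']
```

The second call shows that the daroo (deshoo) form is listed for two cases, so even within [3-9] the relation between cases and forms is not strictly one-to-one.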

Kamio argued that a speaker unconsciously uses the distinctive 
sentence-ending forms described above depending upon the "case" type 
to which his proposition belongs. Perhaps, however, the framework
represented in [3-9] and [3-10] is too simplistic: it connects the
speaker's awareness of each of the six cases of territory interaction
with a single surface linguistic form. In fact, I have found in the data
additional linguistic forms associated specifically with each case;
therefore, further analysis of all the possible sentence-ending forms,
and of how those forms can be integrated into the whole system of
information territory, is necessary to complete Kamio's framework.

Kamio's theory explains why the usage of direct
sentence forms in Japanese is pragmatically limited at the discourse
level. According to Kamio's model, only type (A) information in [3-9]
and [3-10] (the speaker's territory only) is legitimately expressed in
direct forms. Kamio identified three groups of information resources
which are relevant to the notion of the speaker's territory of
information, as described in [3-11] below:
[3-11] 
(a) information obtained through the speaker's direct experience;
(b) information about persons, facts, and things close to the speaker,
including information about the speaker's plans, actions, and
behavior, and places to which the speaker has a geographical relation;
and
(c) information embodying detailed knowledge which falls within the
speaker's professional or other expertise.2

The theory suggests that a speaker is considered to have
"socially-licensed" privileged access to information which belongs to
classes (a), (b), or (c) of [3-11] (at least in Japanese communities). Factor (b)
(i.e., a speaker is entitled to consider the information about persons, 
facts, and things "close" to the speaker as his own territory 
information) presents an outstanding aspect of the Japanese 
sociolinguistic norm. For a Japanese speaker, information about other 
people in his uchi (inside) group (e.g. family matters) is in his own 
territory although Kamio did not emphasize the sociolinguistic 
meanings of factor (b). Factor (a) corresponds to the universally
acknowledged direct evidentials. Factor (c) is also understandable. The
pragmatic restriction that (a), (b), and (c) place on direct
sentence endings results in a large proportion of Japanese sentences
being produced with indirect modality, which belongs to territorial
interaction types (B), (BC), (CB), and (D) of [3-9] and [3-10], i.e., the
indirect form territories. For type (B) information, the speaker needs
to use the direct form plus particle ne. It is another direct territory
of the speaker, but since the
information is shared by the hearer, the particle of information 
sharing, ne, should be added. For (BC) type propositions, since the 
speaker is asking for compliance of the hearer to the proposition of his 
own information territory (which is also shared by the hearer to some 
extent), the auxiliary of compliance-getting, daroo (deshoo in polite
form) should be used.3 Type (CB) propositions should end with auxiliary
daroo or negative question form janai since the proposition falls more 
into the hearer's territory, and the speaker is asking for agreement to 
what he believes is shared with the hearer. Only the hearer is supposed 
to have access to (C) type propositions, so the speaker must make sure to 
express that he is out of his territory by using an indirect form to utter 
the proposition. Particle ne is also obligatory with (C) type information 
as in (B) since the proposition falls deeply within the hearer's territory 
and the speaker asks for the hearer's assent. (D) type information does
not fall in either the speaker's or the hearer's information domain and
therefore should be expressed exclusively in the indirect form. In this
case, "optional ne" can be added. Optional ne is different from the
"obligatory ne" of cases (B) and (C) in that optional ne functions to send
"rapport" (e.g. McGloin, 1990) while obligatory ne asks for the hearer's
assent or compliance (see chapter four for the analysis of ne).
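
The correspondence between the six territory cases and the sentence-ending forms described above can be summarized as a simple lookup table. The following Python sketch is illustrative only: the names Case, ENDING_FORMS, ending_forms, and ne_is_obligatory are hypothetical labels introduced here and are not part of Kamio's notation. It records only what is stated in [3-9], [3-10], and the preceding paragraph.

```python
from enum import Enum


class Case(Enum):
    """The six territory cases of [3-9] (labels are hypothetical)."""
    A = "speaker's territory only"
    B = "both territories, information completely shared"
    BC = "both territories, more within the speaker's"
    CB = "both territories, more within the hearer's"
    C = "hearer's territory only"
    D = "neither the speaker's nor the hearer's territory"


# Sentence-ending forms that the model associates with each case.
ENDING_FORMS = {
    Case.A: ["direct"],
    Case.B: ["direct + ne"],
    Case.BC: ["daroo (deshoo)"],
    Case.CB: ["daroo (deshoo)", "janai"],
    Case.C: ["indirect + ne"],
    Case.D: ["indirect", "indirect + optional ne"],
}


def ending_forms(case: Case) -> list[str]:
    """Return the forms the model predicts for a given case."""
    return ENDING_FORMS[case]


def ne_is_obligatory(case: Case) -> bool:
    """Obligatory ne (asking for assent) is limited to cases (B) and (C)."""
    return case in (Case.B, Case.C)
```

In this sketch each case maps to a short list of forms, which is exactly the one-to-few mapping that the present study argues needs to be extended with the additional forms found in the data.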

Kamio's interest was in the functional analysis of the Japanese
language, so he did not explicitly emphasize the sociolinguistic and
pragmatic aspects obviously involved in his model. I believe that
Kamio's model may contribute to the sociolinguistic and pragmatic
analysis of the Japanese language in the following three major
respects:
[3-12] 

(a) The theory delimited the sociolinguistic territory defined by the
Japanese concept of information "close" to the speaker, and
accordingly provided a reason why indirect mood is dominant in
Japanese spoken discourse.
(b) It suggested that the use of Japanese evidentiality of given
information is "relative" to the hearer's knowledge within a given
discourse.
(c) In accordance with (b), the theory characterized the pragmatic
function of the final forms, e.g., particle ne, auxiliary daroo
(deshoo), direct, and indirect forms as the sentence-final mood
indicators. This concept is remarkable in contrast with the
traditional approach from sentence grammar.
Although Kamio's theory deals with human psychological 
territory of information which potentially involves some sociolinguistic 
aspects, attention was not paid to contextual variables of discourse 
which are possibly influential to the model of information territories. 
Therefore, discourse variables such as the nature of participants and
speech settings were extensively emphasized in this study in order to
locate the sociolinguistic aspects of the Japanese evidentiality system.
Japanese evidentials are based not only on the ways in which information
is obtained, as universal rules of evidentiality define (i.e. direct
evidence/experience vs. indirect evidence/experience). It seems that
the Japanese system is also based on the speaker's awareness of his 
hearer's knowledge. As noted earlier in chapter two, languages such as 
Kogi and Nambiquar share the same kind of hearer-conscious concept 
of evidentiality with the Japanese language (and Kogi and Nambiquar's 
systems are grammaticalized). So the phenomenon of "psychological 
territory for information" is not unique to Japanese. 

In fact, the phenomenon is not limited to a small number of
languages: we find similar concepts in English too. Labov and Fanshel
(1977) analyzed "therapeutic interviews" between mental patients and
their psychotherapists. In doing so, they categorized the initiations by
the psychotherapist into five event categories: A-, B-, AB-, O-,
and D-events. This classification of statements according to the shared
knowledge involved was done for the purpose of anticipating the
"syntagmatic" structure of responses from the patients. The
authors' interest was therefore in the characteristics of responses to
each event category, which is not relevant to this study. However, the authors' method
of categorizing therapeutic speech from the viewpoint of information 
territory is useful. Their categorization of the therapist's speech events 
follows in [3-13]: 
[3-13] 
A-event: events to which the speaker (A) has privileged access.
B-event: events about which the hearer (B) has privileged knowledge.
AB-event: knowledge which is shared by A and B.
O-event: events which are known to everyone present and known to be known.
D-event: events which are known to be disputable.

The authors said that "these classifications refer to social facts--
that is, generally agreed upon categorizations shared by all those 
present" (p. 100). Stubbs (1983) evaluated their study and explained the 
concept of event-classification as follows: 

A-events are events to which the speaker has privileged access,
and about which he cannot reasonably be contradicted, since
they typically concern A's own emotions, experience, personal
biography, and so on. Examples include I'm cold and I don't
know. Notice how, in school classrooms, a statement such as
I don't know may be the only one to which a pupil is not open to
correction. B-events are, similarly, events about which the 
hearer has privileged knowledge. A cannot therefore normally 
make unmitigated statements about B-events, such as you're cold, 
unless A is in authority over B, for example, as mother to child. 
Statements about B-events would normally be modalized or 
modified: You must be cold or You look cold. (118-119) 

Labov also uses three other related terms. AB-events are defined 
as knowledge which is shared by A and B, and known by both to 
be shared. ... O-events are known to everyone present, and
known to be known. D-events are known to be disputable. There 
is therefore a classification of utterances according to the 
amount of shared knowledge involved. These definitions of AB-
and O-events are comparable to the way in which the term 
pragmatic presuppositions is often defined, as propositions 
which are established by the preceding discourse, or which can 
be assumed to be generally agreed. (119) 

As to A-events and B-events, Labov's and Kamio's views are
almost identical in that "A-events are those that typically concern A's
emotions, his daily experience in other contexts, elements in his past
biography, and so on" (1977:100). Accordingly, Labov and Fanshel
stipulated, as a condition for a response to be coherent with the
discourse, the "Rule of Confirmation": "if A makes a statement about
B-events, then it is heard as a request for confirmation."

Responses to assertions are heavily determined by the relation of 
the proposition being asserted to knowledge shared by the 
participants. If A asserts an A-event, he normally requires only 
an acknowledgement of a minimal kind: he often uses such 
assertions to introduce a narrative; B simply must show that he is 
prepared to pay attention during an extended turn at talk. In the 
special case that A makes an assertion about a B-event, his 
utterance is heard as a request for confirmation. Assertions 
about AB- or O-events come closest to the concept of remarks: 
utterances that make minimal demands for response. (101) 

Therefore, Labov and Fanshel paid attention to the hearer's 
responsibility, in English communication, to understand the event 
category to which the speaker's proposition belongs (through both 
context and structure, perhaps) and to correctly reply as expected. In 
my observation, in Japanese communication, the speaker is responsible
for indicating the category of the proposition properly through
sentence-ending forms, and a reasonably polite hearer respects a
reasonably polite speaker's decision on sentence-ending forms. If the
speaker uses a direct evidential for a given piece of information, the
listener accepts that the proposition belongs to the speaker's territory
and will use indirect forms to talk about it himself; if the hearer
does not agree, he may need to show, when he talks, where he considers
the propositional information to belong.

Labov and Fanshel acknowledged O-events and D-events as two
distinct categories. They said that "the clearest interactional
consequences follow when A makes an assertion about a D-event...If A
makes an assertion about a D-event, it is heard as a request for B to give
an evaluation of that assertion" (the "Rule of Disputable Assertions" of
discourse coherence) (p. 101). In their view, it seems, whether the
event is thought to be known or disputable makes a difference in
English speakers' acceptance of what is heard.
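
The relation between Labov and Fanshel's event categories and the way an assertion is heard can be restated compactly. The sketch below is a summary of the passages quoted above, not a reproduction of the authors' own formalism; the names EXPECTED_UPTAKE and heard_as are hypothetical and introduced only for illustration.

```python
# How an assertion by speaker A is heard by hearer B, keyed by the event
# category of the asserted proposition (after Labov and Fanshel, 1977: 101).
EXPECTED_UPTAKE = {
    "A": "minimal acknowledgement",       # A's own emotions, experience, biography
    "B": "request for confirmation",      # Rule of Confirmation
    "AB": "remark (minimal demand for response)",
    "O": "remark (minimal demand for response)",
    "D": "request for evaluation",        # Rule of Disputable Assertions
}


def heard_as(event_category: str) -> str:
    """Return how an assertion about this kind of event is heard."""
    return EXPECTED_UPTAKE[event_category]
```

Note that the burden in this table falls on the hearer, who must identify the category in order to respond coherently; this is the point of contrast with the Japanese case, where the speaker marks the category through the sentence-ending form.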

We can raise some issues with their analysis. First, the border
between O-events and D-events can be very fuzzy. On this point, the
authors claimed that one's "pragmatic presupposition" decides whether
a certain event is O, or AB, or D. A speaker's subjective decision is
assumed to be involved in this process. I find this exercise of
subjectivity to be a very interesting issue. In a given culture, how
much subjectivity are people allowed to exercise in terms of linguistic
expression? The social norm of the degree of acceptance of the speaker's
subjectivity must be different from one culture to another, and from one
language to another. In my 1994 study, it was found that American informants
expressed third party information as everybody's events more often 
than Japanese informants did. So I have argued that for Japanese 
speakers, public information remains, true or not, other people's 
information until the end, at least linguistically; and in the Japanese 
speaker's psychology, it seems, both O-events and D-events belong to the 
same territory (i.e., other people's information) and stay there forever. 
Even after the epistemic "necessity" of the proposition is confirmed, this 
information is expressed in indirect forms. Based on this observation, I 
have further argued that American culture is more belief-oriented than 
Japanese culture in that each speaker's belief on the proposition 
influences the linguistic forms of public events in American culture, 
while in Japanese psychology, the border of the information territories 
between "others" and "mine" is not flexible. However, in this research,
Japanese speakers' behavior with regard to O- and D-events was not
significantly different from that of English speakers. I attribute this
discrepancy between the two studies to a significant difference in the
degree of general public familiarity with certain public events at each
time (cf. chapter five).

There are opinions that there is no such thing as information 
territory. For example, in criticizing Brown and Levinson's "face" 
concept, Matsumoto (1988) quoted Nakane (1967) and said that the 
Japanese culture is group-oriented so that the concept of individual 
territory is not typical among Japanese people. Matsumoto said, 
correctly I think, that the Japanese language is particularly sensitive to 
social context, especially to one's position in relation to others. But I 
consider that this group-orientation of Japanese society does not
necessarily mean that Japanese people do not have a sense of territory.
Every human being (and probably every animal) has some concept of
personal territory. Discussing "space" in Japanese behavioral
psychology in relation to the group-oriented nature of Japanese society,
Japanese psychologist Kimura (1977: 20-24) referred to the theories of
world-famous psychologist Levin, German behavioral scientist Lorentz,
and others. These scholars experimentally investigated the functions of
human concepts of self "position" and "territory," and the
psychological energy required to move out of an individual's territory
into other people's territories. I believe that Japanese people have a
sense of personal territory as well as group territory; at least they
demonstrate this linguistically. The following sentences show the speaker's sense of
group territory and personal territory respectively: 

(3-14) Uchi no kaisha no jinji-bu,
my household POSS company POSS personnel dept.
zenzen dame yo.
at all bad PART(VOC)
(My company's personnel dept. is very inefficient.)

(3-15) Uchi no okusan warito nonbiri shitete sa. 
my household POSS wife fairly laid-back STAT PART(VOC) 
(My wife is fairly laid-back.) 
In (3-14), the speaker called the company he works at uchi-no-kaisha
(lit. my household company) and used a declarative form to talk
about it. In (3-15), talking about his wife, the speaker also used a
declarative mood. In both utterances, it seems that each of the speakers
felt that the information was within his territory: group territory in
(3-14) and personal territory in (3-15).
The overall discourse data indicate that people talked about their
professional knowledge, their direct experience, their family, home
town, and other things as information to which they have privileged
access (i.e., the knowledge in their territory2) and used direct mode to
talk about them. Good evidence is the linguistic negotiation of territory
borders, which is often seen in subtle morphological modification by
conversationalists. If you said to your conversational partner, who
happens to be a linguist, "There is a linguist called Noam Chomsky.
He is coming to Texas to lecture on his political view," your behavior
would be considered inappropriate in disregarding your partner's
information territory. But if your conversational partner is a rational
adult, instead of yelling "I know Noam Chomsky!", he might say nicely "Oh,
is that what he's talking about this time?" In saying so, he shows that the
person named Noam Chomsky and his affiliated information are within
his information territory as a linguistic professional. This kind of
negotiation of territory on the deictic level often happens in Japanese
since direct and indirect deixis are important evidentials in the
language. In Japanese, unlike English, third person personal pronouns
and proper nouns cannot be used by both conversationalists if the
referent is not known to both of them. Observe the following English
conversation:

 (3-16) A: I met Dr. Yen yesterday. 

B: Who is Dr. Yen?/he?/that person? 
In (3-16), in English, speaker B can use the proper name, the
pronoun he, or the phrase that person to refer to the referent. In
Japanese, since speaker B does not personally know the referent, Dr.
Yen, speaker B cannot use the proper name (Dr. Yen) or the pronoun
he. The following (3-17) and (3-18) are acceptable utterances in
Japanese which correspond to English (3-16B):

(3-17) B: Dr. Yen -tte dare?
Dr. Yen QUOT who (Who is the person called Dr. Yen?) 

(3-18) B: Sono hito wa dare?
that person TOP who (Who is that person?)

In (3-17) the indirect quotation marker -tte (or -to iu) (called)
and in (3-18) the demonstrative sono (that) are used to indicate a 
referent who is out of the speaker's information domain. The following 
(3-19) and (3-20) with the proper noun and the personal pronoun he 
respectively are not grammatical when speaker B does not know the 
referent: 

(3-19) B: *Dr. Yen wa dare? 
Dr. Yen TOP who (Who is Dr. Yen?) 
(*ungrammatical) 
(3-20) B: *Kare wa dare? 
he TOP who (Who is he?) 

Sentences such as (3-19) and (3-20), which are ungrammatical at the
discourse level, are frequently used by learners of the Japanese
language, even by advanced learners, and teachers do not dare to correct
them because the utterances are grammatical at the sentential level.

As a matter of fact, in both English and Japanese, speaker A in (3-16)
should have said from the beginning "I met a person called Dr.
Yen yesterday" if he had known that B did not know Dr. Yen, or if he
was not sure about B's knowledge:

(3-21) Kinoo Dr. Yen -tte iu hito ni atta n da. 
yesterday Dr. Yen QUOT person DAT met n COP 

(Yesterday, I met a person called Dr. Yen.) 

Sentence (3-21) is more natural than (3-16A) in most cases in
both English and Japanese conversation when the speaker knows that
the hearer does not share the knowledge of the referent. Therefore, it
is evident that in both English and Japanese, the speaker is
supposed to be conscious of his hearer's knowledge in deciding the
sentence structure (cf. the use of definite and indefinite articles in
English). In terms of deixis, Japanese is more "persistent" than English
in that a Japanese speaker cannot use proper nouns/third person
pronouns for the referent if the referent is not in his information
domain. This restriction does not change within a given discourse even
after the referent is introduced and fully explained by one of the
discourse participants (e.g. Kuno, 1988, Shibatani, 1990, Takubo and
Kinsui, 1992).4
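
The deictic restriction illustrated in (3-16) through (3-21) can be restated as a small decision rule. The sketch below is a deliberate simplification under stated assumptions: the function names are hypothetical, knowledge of the referent is reduced to two yes/no values, and only the quotative form of (3-21) is generated (the demonstrative sono hito of (3-18) is another acceptable choice that is not modeled here).

```python
def bare_reference_allowed(known_to_speaker: bool, known_to_hearer: bool) -> bool:
    """A bare proper noun or third person pronoun is acceptable only when
    the referent is known to both conversationalists."""
    return known_to_speaker and known_to_hearer


def refer(name: str, known_to_speaker: bool, known_to_hearer: bool) -> str:
    """Return a discourse-appropriate way of mentioning the referent."""
    if bare_reference_allowed(known_to_speaker, known_to_hearer):
        return name
    # Otherwise the referent is marked with the quotative, as in (3-21).
    return f"{name} -tte iu hito"
```

Because the two knowledge values are passed in unchanged for every mention, the sketch also reflects the "persistence" noted above: explaining who the referent is during the conversation does not by itself license the bare form.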

Lacoste (1981) showed some interesting examples of negotiation 
of speech territory between doctors and their patients in French. 
Doctors are positioned higher than their patients since they use their 
professional skills to help patients, but, at the same time, they also 
depend on the patients' description of their physical condition to enable 
them to use those skills. Lacoste found, therefore, that often in medical
interviews the boundary between "patient's events" and "doctor's
events" is blurred and fluctuating. Patients used their knowledge of
their physical condition, and made attempts to linguistically invade
the doctor's territory, while doctors, on the other hand, defended their
professional territory by brandishing their professional knowledge.
One example of linguistic territory negotiation on the lexical level is
shown below:

(3-22)
Doctor: (a) Depuis quand avez-vous mal au ventre?
(How long have you had this pain in your stomach?)

Patient: (b) J'ai jamais eu mal au ventre, j'ai eu mal à la rate.
(I've never had a pain in my stomach. I have a pain in my spleen.)

Doctor: (c) Écoutez, la rate vous n'êtes pas forcé de savoir où
c'est, vous avez eu mal au ventre.
(Listen, the spleen, you are not supposed to know where that is,
you had a pain in the stomach.)

Patient: (d) J'ai mal là (geste de désignation).
(I have a pain there/designative gesture)

Doctor: (e) Comment vous appelez ça? C'est le ventre. Vous avez
mal au ventre.
(What do you call that? That's the stomach. You have a pain in
the stomach.)

Patient: (f) Si vous voulez. (If you say so.)
(Lacoste, 1981: 172)
Obviously, the doctor in the above conversation was not happy
with the patient's use of the word la rate (spleen) as well as the patient's
assertion that he had pain in his spleen. The event belongs to the
doctor's territory (i.e., professional knowledge). In (3-22c), the doctor's
utterance vous avez mal au ventre (you have a pain in the stomach)
sounds too direct in speaking about other people's pain, but is supposed
to be acceptable due to his profession.
The next example shows negotiation of territory in Japanese
through the sentence-ending modality.

(3-23) 

Child A: ashita okaasan i -nai yo 
tomorrow mother exist -NEG PART(VOC) 
(Tomorrow, our mother will be out.) 

Child B: uso da yo. Iru yo!
lie COP PART(VOC) exist PART(VOC)
(It's a lie. She will be here.)

Adult C: (talking to A)
A-chan, okaasan soo i-tte-ta?
your mother so say STAT PAST
iru to omo-tta-n da kedo naa.
exist QUOT think-PAST-n COP but PART(RAPP)
(Dear A, did your mother say so? I thought she
would be here, but...)

In sentences (3-23A) and (3-23B), both children (brothers) used
direct endings (declarative modality) indicating that the information
about their mother is within their personal information territory. Child
A indicated that the information was "his" using the direct modality;
therefore, Child B also used direct forms to negotiate territory. Both of
them could have used an indirect sentence such as I thought mother
would (or wouldn't) be here as Adult C did in (3-23C), but the children
did not choose this alternative, presumably because they did not feel the
need to be polite to each other; they are young and their relationship is
intimate.

ANOTHER VIEW OF LISTENER-ORIENTED MODALITY IN JAPANESE 

There have been some criticisms of Kamio's model. Except for
one researcher who specifically stated that Kamio's model is not
applicable to the Japanese system of demonstratives (Ono, 1995), those
rejecting or criticizing his model have not presented clear reasons for
their disapproval; generally, the opponents of the model simply claim that
the concept of information territory does not seem to be applicable to
Japanese linguistic phenomena as a whole.5

There has, however, been another major approach to the
pragmatic functions of Japanese sentence-ending forms on the
discourse level. Takubo and Kinsui (Takubo, 1990 and 1992; Kinsui, 1990;
Takubo and Kinsui, 1990, 1992) proposed a Japanese discourse model 
based on Fauconnier's mental space theory6 as well as discourse marker 
theories by Schiffrin (1987) and others. They named the theory "danwa 
kanri riron" (theory of discourse management). I will call their theory 
"mental space theory" in this chapter. As the name implies, this theory 
attempts to explain Japanese linguistic issues from the viewpoint of the 
speaker's assumption about the hearer's knowledge about the 
proposition expressed. It is true that we usually have some particular 
hearer in mind any time we make an utterance. A speaker needs to take 
the hearer's knowledge into consideration, and choose appropriate 
linguistic forms such as words or sentence structures. When the 
speaker introduces a new issue in discourse, he needs to linguistically
indicate that the issue is new (e.g. Yesterday, James found a peach in
our yard.) After the speaker introduces a new issue, which is not shared
by the hearer, the speaker, before making his next utterance, needs to
consider how the hearer's knowledge has been changed by the
information that he has just given to the hearer (e.g. The peach was
actually a giant peach). This kind of discourse-managing behavior based
on the hearer's assumed knowledge is normally seen in every language,
but the means of doing so must vary across languages.

The theory of discourse management assumes that mental space 
is a discourse management system. Mental space is considered to be a 
layered database, and each utterance in conversation is a kind of 
command to use the database to register, search, infer, and so forth. The 
authors claimed that in Japanese, mental space is divided into two areas: 
a "direct experience area" and an "indirect experience area". The direct 
area involves long-term memory, episodic memory acquired through 
direct experience, and knowledge that is obtained from the on-going 
conversation. The indirect area contains information that is obtained 
linguistically (i.e., reading or hearing as indirect experience). In the 
mental space theory, the hearer's assumed knowledge is speculated to be 
in the indirect memory area of the speaker. In short, Takubo and Kinsui 
suggested that we have three interacting areas of memory: the direct 
information field (for directly obtained knowledge), the indirect 
information field (indirectly obtained knowledge), and the hearer's 
knowledge field within the speaker's indirect information field (since it 
is only assumed by the speaker as his indirect experience). Their
theory assumes that the sentence-ending modality and other modals are
the speaker's "message" to the hearer or the speaker himself to organize
memories in different memory areas. Just as Fauconnier hypothesized that
the same information exists in multiple mental spaces and is described
differently linguistically, Takubo and Kinsui assumed that the same
information can exist in different connected memory spaces. They
attempted to explain nouns/third person pronouns, sentence-final
particles, and demonstratives in order to indicate how the speaker
interacts with the same information in different memory spaces.7

I do not consider Takubo and Kinsui's approach to be 
significantly different in effect from Kamio's model at least on the issue 
of the relationship between the sentential ending forms and proposition 
types. As Kamio did, Takubo and Kinsui paid attention to the
hearer-sensitivity of Japanese sentence-final forms and explained the
function of the forms. Takubo and Kinsui used the concept of memory space
of the speaker and the hearer, while Kamio used the concept of
information territories of the speaker and the hearer, as [3-24] shows below.
In both models, forms of sentential modality are related with the types 
of information. 

Both theories assume four similar basic categories of
evidentiality types. The difference between the two models is that in
Kamio's model sentence-ending forms and information domains are
simply connected, while the mental space theory views particular
words (including the sentence-ending forms) as distinctive
"signs" or "commands" presented by the speaker in organizing
information in the memory space of both himself and his hearer. For
example, Takubo and Kinsui (1992) claimed that the Japanese
sentence-final particle ne expresses the speaker's "command" for
confirmation if information exists in two places (his memory and the
hearer's memory).

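[3-24] Information territory theory vs. mental space theory

Type of events            | Information territory theory    | Mental space theory              | Evidentiality
--------------------------+---------------------------------+----------------------------------+---------------------
direct information for    | In the speaker's territory (A)  | In the speaker's direct memory   | direct
the speaker               | (direct ending)                 | space (a)                        | (speaker's evidence)
--------------------------+---------------------------------+----------------------------------+---------------------
indirect information for  | In the other people's territory | In the speaker's indirect memory | indirect
the speaker               | (D) (indirect ending)           | space (b)                        |
--------------------------+---------------------------------+----------------------------------+---------------------
direct information for    | In the hearer's territory (C)   | In the hearer's memory space in  | indirect
the hearer                | (indirect + ne ending)          | the speaker's indirect memory    | (hearer's evidence)
                          |                                 | space (c)                        |
--------------------------+---------------------------------+----------------------------------+---------------------
shared information for    | In the shared territory         | (a) or (b) and (c)               | direct
the speaker and the hearer| (B, CB, BC) (daroo, ne-related  |                                  | (shared)
                          | endings)                        |                                  |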

(Note: A, B, CB, BC and D are from [3-9, 3-10] in this chapter.)

(3-25)
kimi no tanjoobi wa san-gatsu desu-ne.
your birthday TOP March COP(FOR) PART(CONF)
(Your birthday is in March; we have the same information, don't we?)

In (3-25), the proposition is the hearer's matter but the speaker
knows it too, so the speaker confirmed the existence of the same piece
of information in two places of his own memory--the speaker's indirect
memory area and the assumed hearer's memory area in the speaker's
indirect memory area--by saying (3-25), where ne is the "sign" of this
"memory-matching" action. Takubo and Kinsui characterized the final
particle yo as a speaker's command to the hearer to write information
in the indirect memory. That function perhaps can be phrased as
"speaker's declaration of some speaker's matter" of which the hearer
does not have knowledge.
(3-26) A: ogenki desu-ka? 
well/active COP(FOR) Q (How are you?) 
B: watashi wa moo 70 desu-YO. 
I TOP already 70 COP(FOR) VOC 

(I am already 70 years-old, now you know I must not be very well.) 
(Takubo, 1992: 23) 
In the above conversation, the surface meaning of B's answer (i.e., I
am already seventy) is not straightforwardly relevant to A's question
and is therefore considered to be a case of "implicature" in Grice's
sense. In English, the hearer is required to contextually analyze the
implicated meaning (or to find out whether it is an implicature or a
"blatant" failure to fulfill a maxim), while in Japanese the final forms
such as particles help to suggest the existence of an implication
expressed by the speaker, as indicated in (3-26). In a sense, this
phenomenon implies the importance of final forms in Japanese pragmatics
from the viewpoint of the Cooperative Principle.

In Kamio's theory, sentence (3-26) B is simply within the 
speaker's own information territory so that the direct form desu is 
acceptable, and particle yo is optional. Actually, ne is the only sentence-
final particle that matters in Kamio's model. This is reasonable since, 
among particles, only ne (and possibly na) seems to function to indicate 
the shared knowledge (e.g. McGloin, 1990, Ueno, 1971). In the same way, 
the mental space theory defined particle yone as a sign to confirm the 
sameness of the information which has just been written in the 
speaker's indirect memory area and the information which already 
exists in the hearer's memory area. 

So far, Kamio's model and Takubo and Kinsui's model do not 
appear significantly different from each other with regard to the 
function of the sentence-ending forms; they merely have different 
viewpoints. However, the difference appears in the analysis of ending 
form daroo (deshoo), demonstratives, and other noun phrases. As is 
noted in chapter two, the Japanese auxiliary daroo is traditionally said 
to have two distinctive meanings: one is conjecture (probably) and the 
other is confirmation (tag-question isn't it? etc.) as in the following 
examples: 

(3-27) Uchi no imooto, kirei daro. 
my POSS younger sister pretty AUX(confirmation) 
(My sister is pretty, isn't she?) 

(3-28) Ano hito, ko-nai daroo to omotte -ta. 
that person come-NEG AUX QUOT I think(STAT)-PAST 
(conjecture) 
(I expected that person would probably not come.) 
Sentence (3-27) shows "confirmation daroo" and (3-28) shows 
"conjecture daroo". In the mental space theory, since the proposition of 
a speaker's conjecture is not supported by direct evidence, it should be 
written in his indirect memory area. Therefore, conjecture daroo is a 
sign that a proposition is to be written in the speaker's indirect memory 
while confirmation daroo is the speaker's sign (or command) to the 
hearer to write information into the hearer's direct memory area since 
the information which needs to be confirmed is naturally shared by the 
hearer. The theory specifies that the hearer's information area resides 
in the speaker's indirect memory area; therefore, in the mental space 
theory, the auxiliary daroo (both "conjecture" and "confirmation") is 
characterized as a sign that the speaker inputs his information into his 
indirect memory space. By doing this, the theory puts the function of 
the two types of daroo together. I believe that this view is also 
insightful. 
The mental space theory seems to be more extensible to other 
areas of linguistics, but how far it can be applied is not yet known. One 
problem with Takubo and Kinsui's mental space theory is its premise that 
only information obtained by direct experience or from long-term memory, 
which is stored in direct memory space, can be linguistically described in 
direct forms. This premise does not match the actual Japanese usage 
of direct/indirect language forms. In reality, as Kamio clarified, 
Japanese speakers use direct forms to describe information which 
they did not obtain through direct experience but with which they feel 
socially entitled to claim intimacy. The theory of territory of 
information explains the Japanese concept of direct information well. 

Also, some phenomena in Japanese that do not conform to the 
universal evidentiality rules are easy to understand in the framework of 
information territory. In (3-29), speaker F2 related an episode 
concerning Princess Masako. She made her statement in a direct form, 
which caught the attention of her hearers. 

(3-29) F2: Masako-san, kekkon suru mae ni 
Princess Masako marriage get before TEMP 

esute janai kedo, nannka kayotte -ta -no yo. 
aesthetic NEG but something go (STAT)-(PAST)-n VOC

 (Princess Masako frequently went to somewhere like 
aesthetic salon before she got married, I am telling you.)

 Others: sugoooi .. ..johoo ga
extravagant information NOM 


(What an information source you have!) 

This is an example in which a speaker evidentially claims that a 
given piece of information is in her territory although it is not 
supposed to be. This violation of territory rules was intentionally made 
by the speaker who proudly announced that she watches almost all midday TV talk shows and became very "resourceful" about popular gossip. 

Violation of territory rules also occurs in the opposite direction. In the 
following (3-30), by using the indirect auxiliary mitai (it seems), 
speaker B appears to have refrained from claiming ownership of 
her information:

 (3-30) A: Go-shujin no kaisha doo? 
Your husband POSS company how 
B: Chotto dame mitai. Raigetsu heisasuru-koto ni 
no-good it seems Next month close COM DAT 
kimatta -tte. Shujin ga kinoo itteta wa. 
decided QUOT My husband NOM yesterday said STAT RAPP

 (A: How is your husband's company doing? 
B: It seems that it is not doing well. I heard they 
decided to close the company next month. My 
husband told me yesterday.) 
In (3-30), the speaker, in talking about her husband's business 
that is closely related with her life, used an indirect form mitai (seem). 
Her intention can be understood as modesty in respecting her husband's 
information territory. These phenomena of the "assertion of 
information ownership" (i.e., non-use of socially required indirect 
forms) as in (3-29) and "speaker's intentional neglect of information 
ownership" (i.e., non-use of socially approved direct forms) as in (3-30) 
can be well explained under the assumption of existing information 
territories. 

In light of these observations, it seems reasonable to hypothesize 
psychological information territories which a speaker perceives in 
interactional spoken discourse. The concept of territory may be only a 
surface view of Japanese modality, but it is very useful for systematizing 
the use of sentence-ending evidentiality. 


CHAPTER 3: NOTES 

1 However, it is true that Japanese speakers do not often 
"explain" the details of their contention under the assumption that the 
hearer knows what the speaker is talking about. Thus, an extensive 
explanation of a topic tends to be considered impolite. This behavior is 
problematic because it often results in mis-communication. This 
cultural issue is discussed in chapter seven in relation to the Japanese 
background of evidentiality markings. 

A grammatical aspect of Japanese which emphasizes the 
speaker's delicate concern with the listener is called the "empathy" 
phenomenon in Japanese grammar. It involves the speaker and listener 
relationship as an important aspect of, for example, syntax. 

Kuno (e.g. 1976, 1978, 1987) drew academic attention to 
"speaker-empathetic" phenomena in Japanese grammar. He defined "empathy" as 
"the speaker's identification, which may vary in degree, with the 
person/thing that participates in the event or state that he describes in 
a sentence" (1987:206). Actually, such phenomena are not limited to 
Japanese. As an example, Kuno cited the following English 
sentences which describe a situation where John hit his brother Bill: 

[3-31] John hit Bill. 
John hit his brother. 
Bill's brother hit him. 
Bill was hit by John. 
Bill was hit by his brother. 
?? John's brother was hit by John. 
* His brother was hit by John. 
The last two sentences are syntactically grammatical but their 
acceptability is lower than the others due to the discrepancy between 
the speaker's empathy and the sentential subject: Kuno argued that the 
structural subject legitimately receives the highest focus of the 
speaker's empathy but the phrases "John's brother" and "his brother" 
are not "empathetic" from the speaker's perspective. Kuno gives five 
different hierarchies which interact with each other to produce 
different degrees of acceptability. The following is the summary of his 
empathy hierarchies: 

[3-32] 

The Speech Act Empathy Hierarchy: the speaker must empathize with 
himself rather than any other person or object; 

The Topic Empathy Hierarchy: the speaker must empathize with a 
discourse topic rather than a non-topic; 

The Descriptor Empathy Hierarchy: given two descriptors (e.g. 
'John' and 'John's brother'), the one on which the other descriptor 
depends shows the speaker's focus of empathy; 

The Surface Structure Empathy Hierarchy: the subject of a sentence is 
the focus of empathy; 

The Word Order Empathy Hierarchy: the left hand NP in a coordinate 
structure is more readily empathized with than the right hand NP. 

According to the theory, there cannot be more than one focus of 
empathy within a given sentence; therefore, if two or more empathy 
foci conflict, the sentence will not be acceptable 
("Ban on Conflicting Empathy Foci"). This observation might be valid 
across languages. 

Based on his series of empathy theories, Kuno explained that certain 
phenomena of Japanese grammar, such as the auxiliary use of "giving 
and receiving" verbs, reflexives, and empathy adjectives, are 
empathy-oriented. Kuno's argument emphasized the role of the speaker's 
subjectivity in producing sentences. 

2 As introduced in chapter two, Kamio listed and characterized the 
three major categories of information which belong to the (A) type 
(only speaker's) territory as follows: 

[3-33] 

(1) Information about direct experience: 
Information that is obtained through the speaker's direct 
experience is a central component of information that falls 
within his territory of information. 

(e.g.) Watashi atama ga itai. 
I head NOM ache. (I have a headache.) 

(2) Information about personal data: 
(2a) Personal information: 
Even if a speaker lacks a direct experience, personal information 
such as family matters falls within the speaker's territory. 

(e.g.) Kanai wa 46 desu. 
my wife TOP 46 years old COP(FOR) 
(My wife is 46 years old.)

 (2b) Geographical information: 
A subclass of personal information involves those concerned 
with geographical information which is intimate to the speaker. 
The following sentence should be expressed as falling in the 
speaker's territory if the speaker is from Kyoto. 

(e.g.) Kyoto no jinkoo wa 150-man gurai desu yo. 
Kyoto POSS population TOP 1,500,000 about COP(FOR)(VOC) 

(The population of Kyoto is about 1,500,000.)

 (2c) Information about plans, actions, and behavior 
Another subclass of personal information. 

(e.g.) Kore kara Osaka e ikimasu. 
this from Osaka LOC go(FOR) 

(I am going to Osaka now.) 

(3) Information about expertise 

(e.g.) Travel agent: 
Pari e wa chokkoubin ga benri desu. 
Paris LOC TOP direct flight NOM convenient COP(FOR) 
(To Paris, a direct flight is convenient.) 

(e.g.) Professional demographer: 
Kyoto no jinkoo wa 150-man gurai desu yo. 
Kyoto POSS population TOP 1,500,000 about COP(FOR)(VOC) 
(The population of Kyoto is about 1,500,000.) 

Therefore, in Kamio's model, a direct assertion which falls in the 
speaker's territory is based not only on the speaker's direct experience 
but also on knowledge from his profession and personal data. The speaker 
is "socially authorized" to speak about these topics in direct forms. 

3 Pragmatic use of auxiliary daroo (tag-question) was first 
systematically explained by Kinsui (1992) with his mental space theory. 
Kamio's original model (1990) had four territories of information but he 
later revised it into one with six "cases" of interaction of the speaker's 
and the hearer's territories (1994). In Kamio's original model, the 
auxiliary daroo was not involved as an important form of sentence-final 
modality. 

4 Observe the following example of the choice between direct and indirect 
deixis in Japanese, shown in a conversation between person A and person B: 

(3-34) A: UCLA no Akatsuka-tte iu gengogakusha ga 
UCLA POSS PROPER NAME QUOT linguist NOM 
kondisyonaru to episutemorogee no hanashi kaiteta naa. 
conditionals and epistemology MODI topic wrote(STAT) I recall 

(A: A linguist whose name is Akatsuka at UCLA wrote an article 
about epistemology and conditionals, I remember.) 
B: Akatsuka wa episutemikku sukeeru no aatikuru ga 
PROPER NAME POSS epistemic scale MODI article NOM 
moo hitotsu at-ta deshoo. 
more one exist-AUX(PAST) AUX (confirmation)

 (B: There is another article of Akatsuka's concerning 
epistemologic scale, isn't there?) 
In (3-34), speaker A used the quoted expression Akatsuka-tte-iu 
gengogakusha (a linguist named Akatsuka) implying that A assumed 
that B does not know Akatsuka. If B did not know the referent, as A 
assumed, B would be supposed to accept the indirect modality of the noun 
phrase for the referent which is assigned to her by speaker A and use it (e.g. 
sono Akatsuka-tte iu hito [that person named Akatsuka]). But, in reality, 
B knew Akatsuka, so speaker B in (3-34) did not use the indirect quoted 
form of the referent, instead she simply used the direct noun form 
Akatsuka. By doing so, speaker B demonstrated that she knows Akatsuka 
well and that Akatsuka is in her speech territory contrary to speaker A's 
assumption, which might have been perceived as being impertinent. In 
Japanese, a speaker is required to use the deictic as it is introduced to the 
discourse by his conversation partner until they find that both parties 
have the same information. I feel that B's act in (3-34B) is nothing but a 
negotiation of personal speech territory, which I perceive as aggressive. 
If speaker B had desired to be polite, B should have used the quoted 
indirect expression that A had used, admitted that she knows Akatsuka, 
and then shifted to a different referring expression as in (3-35): 

(3-35) B: Aa, sono UCLA no Akatsuka-tte iu gengogakusha 
Oh, that UCLA POSS PROPER NAME QUOT call linguist 
nara shitteru wa. 
COND know PART(RAPP) 
Akatsuka no episutemikku sukeeru wa moo 
PROPER NAME POSS epistemic scale TOP more 
hitotsu aatikuru ga atta desho. 
one article NOM existed doesn't it? 


(B: Oh, I know that person called Akatsuka at UCLA. Wasn't there 
another article of Akatsuka concerning epistemologic scale?) 
In (3-35), speaker B replied using the indirect quoted form of the 
proposition (linguist called Akatsuka) as introduced by the 
conversational partner A, not asserting her information territory. By 
context, B in (3-35) indicated the proposition is shared by both sides. 
Since (3-35) B used the indirect modality first, it would be considered to 
be polite by all. Also, speaker A could have been polite in showing that 
he assumed that the proposition was shared by hearer B from the 
beginning by using the direct noun without the quotation markers. In 
this way, the use of deictics presents another important "territory" 
factor in Japanese pragmatics. 

5 Whether or not Kamio's concept is applicable to the whole 
system of Japanese pragmatics is not known. That issue is beyond the 
scope of this dissertation. However, Kamio (1990) certainly attempted to 
show that the concept is fairly applicable to a wider range of linguistic 
phenomena in both English and Japanese. He attempted to apply the 
theory of information territory to various language structures such as 
sentence structures (e.g. cleft sentence, presuppositional phrases, 
performative sentence, thetic judgement), noun phrases (e.g. 
anaphors, demonstratives), lexical meanings of some words (e.g. come 
vs. go, this vs. that), and other discourse aspects such as intonation and 
honorifics. 

6 Fauconnier's mental space theory: Fauconnier (1985) originated 
a pragmatic theory of semantics named the mental space theory (espaces 
mentaux). This theory is useful for evidentiality studies in that it deals 
with the psychological connection between linguistic forms and 
direct/indirect memories. The theory uses basic mathematical concepts 
to solve some problematic semantic issues. Fauconnier argued that the 
central features of language organization depend on their links with 
other cognitively motivated structures, and that linguistic expressions 
contribute to setting up connected mental domains. Fauconnier posited 
that we have multiple mental worlds (or spaces), which are connected 
with each other, and reflect the real world differently. He said that 
"Linguistic expressions will typically establish new spaces or refer back 
to one already introduced in the discourse." (p. 17) He explained that 
linguistic "space-builders" may be prepositional phrases (e.g. in Len's 
mind, in 1929, at the factory), adverbs (e.g. really, probably, 
theoretically), connectives (e.g. if A then B, either A or B), and 
underlying subject-verb combinations (e.g. Max believes, May hopes). 

For example, consider the following sentences. 

(3-36) Susan likes Harry.
(3-37) Max believes that Susan hates Harry.


According to Fauconnier's theory, sentence (3-36) sets up space R 
(origin = "speaker's reality") and establishes a relation between Susan and 
Harry in space R (= Reality). In sentence (3-37), the phrase Max believes 
is a space-builder which establishes space M. The phrase Susan hates 
Harry establishes a relation between Susan and Harry in space M, which 
happens to be different from reality. The theory explains that, for 
sentences (3-36) and (3-37), we must assume two mental worlds, that both 
worlds are connected by a function called "connector F", and that the 
relationship between Susan (a) and Susan (b) in the two worlds is described 
as F(a)=b. This identity relationship means that the two Susans are the 
same person. 
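
The connector relation F(a)=b can be illustrated with a minimal sketch. The class name Space, the variable connector_F, and the labels c and d for Harry are assumptions introduced only for this illustration; the note itself names only Susan (a) and Susan (b), and nothing here is Fauconnier's own formalism.

```python
from dataclasses import dataclass, field


@dataclass
class Space:
    """A mental space: a named set of elements and relations among them."""
    name: str
    elements: dict = field(default_factory=dict)   # element label -> description
    relations: set = field(default_factory=set)    # (predicate, label, label)


# Space R (speaker's reality), set up by sentence (3-36).
R = Space("R", {"a": "Susan", "c": "Harry"}, {("likes", "a", "c")})

# Space M, set up by the space-builder "Max believes" in sentence (3-37).
M = Space("M", {"b": "Susan", "d": "Harry"}, {("hates", "b", "d")})

# Connector F links each element of R to its counterpart in M.
# The labels c and d are hypothetical; only a and b appear in the note.
connector_F = {"a": "b", "c": "d"}


def counterpart(label):
    """Return F(label), the counterpart in M of an element of R."""
    return connector_F[label]


# F(a) = b: the Susan of R and the Susan of M are the same person,
# even though the relation holding in each space differs.
same_person = R.elements["a"] == M.elements[counterpart("a")]
```

The two spaces hold different relations (likes in R, hates in M) while the connector keeps track of who is who across them.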

The theory is relevant to the study of territory as well as 
evidentiality in that it argues that a speaker linguistically expresses the 
space in his mind to which his information/knowledge belongs. 

Fauconnier applied the theory to various linguistic issues: 
anaphoric pronouns, definite descriptions, assumption, conditionals, 
comparative sentences, and others. 

7 I speculate that mental space theory is promising in providing a 
"deep structure" of Japanese modality usage, while the territory theory 
provides a sort of "surface account structure". It is true that when we 
talk to somebody we consistently need to refer to our hearer's 
knowledge (what we assume they have) somewhere in our memory and 
linguistically show our understanding of the hearers' changing 
knowledge in on-going discourse. So neurologically, the mental space 
model might reflect the biological behavior of our brain. The 
consequence of this mental behavior, i.e., a speaker's choice of 
evidential and other modality of each utterance, may be seen as 
reflecting the model of territories of information as in Kamio's 
framework on the surface. 


CHAPTER 4: METHODOLOGY 

Creating a realistic model of the Japanese evidentiality system 
naturally requires a thorough investigation of the actual use of 
Japanese evidentials. This study may be considered sociolinguistic 
quantitative empirical research in that the analysis is genuinely based 
on data collected from informants' natural everyday speech in various 
speech situations. I have examined individuals' linguistic performance 
in my native language and culture. In this sense, I have an advantage 
in understanding the language user's meanings, both surface and 
intended meanings, but at the same time, my perspective may lack 
"objectivity" due to my status as an insider. I tried to be cautious 
regarding this concern, and have sought out third persons' opinions as 
much as possible to ensure that my interpretation of informants' 
meaning is proper. In particular, understanding the speaker's meaning 
encoded in a subtle difference of intonation (sentence-ending tone, for 
example) is a difficult task which may produce disagreement even 
among native speakers. However, the primary judgement of the 
meaning of informants' speech behavior was performed by myself. 

DATA COLLECTION 

Most of the data collection was done in the informants' familiar 
environment with native culture (i.e., Japan, or a quasi-Japanese 
community in the U.S.A.). The data corpus was collected between 1990 
and 1997 but the majority was obtained in 1996. The American sites were 
primarily Madison, Wisconsin, and Austin, Texas, where I engaged in 
M.A. and Ph.D. studies. During this time, my primary interest was in 
discourse analysis; main areas included "tense-alternation", "discourse 
organization", "Represented Speech and Thought (or RST)" (cf. Banfield, 
1982), "speaker's subjectivity and discourse grammar", "common 
cultural understanding for discourse background", and "hearsay 
discourse". In performing research on these interests, I collected a 
variety of spoken discourses (e.g. storytelling, conversational, and 
interviewed discourse). Since I taught Japanese during this period in 
both places as teaching assistant and assistant instructor, I became 
acquainted with a number of Japanese graduate students who were my 
main informants from American sites. Most of them belonged to, more 
or less, the same age group (25 to 35 years old), and speech events were 
generally informal. With the purpose of obtaining more divergent data 
in regard to speech setting, I spent six weeks in Japan (Tokyo area) in 
1996. During this time, I met friends, their families and friends and 
visited their work-places and other social occasions to acquire an 
extensive data collection. More informal data was collected than formal 
data, but I believe that videotaped/audiotaped formal speech events from 
publicly available speech situations (e.g. "TV interview program", "news 
report show", and "public talk") supply sufficient formal speech data. 
Informants were from a wide range of age groups, ranging from eight 
years old to their seventies. The following table shows the schematic 
stratification of the informants and the quantity and type of speech data I 
actually used for this research. 

[4-1] Number of informants: 
(age) 0-9 10-19 20-29 30-39 40-49 50-59 60- Total 
Male 3 2 8 6 4 5 28 
Female 3 9 3 11 1 2 29 
Students 17* 20* (37*) 
(* Students' data were not individually analyzed, but were treated 
as group data.) 

Recording hours: 
Audio tapes: approx. 20 hrs 
Video tapes: approx. 5.5 hrs 

Number of speech events: 
Formal group: 14 
Informal group: 11 
Public: 5 
School: 2 
Courtroom: 4 

Number of speech units examined: approx. 10,700 

Number of speech units (i.e., sentences) 
with clear modality and used for analysis: 7,024 

Number of speech units analyzed: 
Formal group: 1,993 
Friends: 1,904 
Family: 1,462 
Public: 401 
School: 630 
Courtroom: 634 
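
As a quick consistency check on the figures above, the per-situation counts can be summed and compared with the reported number of speech units used for analysis. This is only a sketch: the counts are copied from the table, while the variable names and the percentage breakdown are illustrative additions and not part of the original tabulation.

```python
# Speech units analyzed per speech situation, copied from table [4-1].
units_by_situation = {
    "Formal group": 1993,
    "Friends": 1904,
    "Family": 1462,
    "Public": 401,
    "School": 630,
    "Courtroom": 634,
}

# Reported number of speech units with clear modality used for analysis.
reported_total = 7024

# Sum of the per-situation counts.
total_units = sum(units_by_situation.values())

# Share of each situation in the analyzed data, in percent (one decimal).
shares = {
    situation: round(100 * count / total_units, 1)
    for situation, count in units_by_situation.items()
}

# True when the breakdown adds up to the reported total.
counts_are_consistent = total_units == reported_total
```

The six counts do add up to 7,024, so the breakdown and the reported total agree.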

Informants are numbered M1 through M28 for males, F1 through 
F29 for females, and S1 and S2 for two groups of students (cf. Appendix 
A). In the above [4-1], the informants are partitioned simply according 
to biological background information, age and sex. My intention was to 
collect a variety of speech events which involve different types and 
degrees of formality created by speech situations including a variety of 
relationships among the speakers. So overall, information concerning 
speakers' relationship, such as power difference, is considered to be 
included in the categorization of speech situations. Speech situations 
are roughly grouped into six types: "formal group conversation", 
"discourse of talking to public", "informal group discourse", "family 
discourse", "teacher and student discourse", and "court discourse". 
Family discourse is naturally considered to be "informal", but it is 
regarded as an independent group based on the speculation that 
Japanese family members share a strong sense of in-group membership 
and this might affect the rules of evidentiality within the group. 
Therefore, "family discourse" and "informal group discourse" are under 
the overall category of "informal discourse" while "formal group 
conversation", "talking to public", "teacher's discourse", and "courtroom 
discourse" are considered to fall under the category of "formal 
discourse". However, each discourse type was analyzed independently 
due to some observed difference in evidentiality phenomena among the 
groups. 
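
The two-level grouping described above can be summarized as a simple lookup, offered here only as a minimal sketch. The labels follow the six situation types named in the text; the identifiers OVERALL_CATEGORY and group_by_category are hypothetical and are not part of the coding scheme itself.

```python
# Each of the six speech situation types mapped to its overall category,
# following the grouping described in the text.
OVERALL_CATEGORY = {
    "formal group conversation": "formal",
    "discourse of talking to public": "formal",
    "teacher and student discourse": "formal",
    "court discourse": "formal",
    "informal group discourse": "informal",
    "family discourse": "informal",
}


def group_by_category(mapping):
    """Invert the mapping: overall category -> list of situation types."""
    grouped = {}
    for situation, category in mapping.items():
        grouped.setdefault(category, []).append(situation)
    return grouped


grouped = group_by_category(OVERALL_CATEGORY)
```

Keeping the six types as separate keys mirrors the decision to analyze each discourse type independently while still allowing a formal/informal comparison.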

Most informants were well-educated members of the middle 
class.1 The sample is actually a "convenience sample" given the 
constraints of gathering field data from familiar people, so that 
informants are not evenly stratified. While there may be 
more suitable groups of informants equally distributed among age 
groups, I believe that the given group of speakers suffices for the 
purpose of this research. 

I also believe that the process of data collection was highly 
natural due to my function as a participant in a high proportion of the 
data. It has been suggested that face-to-face interviews are appropriate 
for quantitative research that requires volume and quality of recorded 
speech; however, the "experimental effect" is unavoidable in interviews 
(e.g. Labov, 1984). Fortunately, this was not a serious problem in this 
research since I was, most of the time, a "participant-observer" in group 
settings, although I was sometimes an interviewer in initiating talks. 
With the exception of data collected from public speech, in group 
discussions, there were often more than two speakers besides myself, 
and I was familiar with most of the informants. It is still 
undeniable that the act of recording may have caused a "recording 
effect", but I noticed that my informants often forgot the existence of 
the tape recorder when in a group of people. Part of the data was 
procured from face-to-face interviews for which the experimental 
effect can be anticipated. 
When doing interviews and also when participating in group 
conversation, when applicable, I used some prepared discourse topics 
for the informants to talk about. The main concern of this research as 
an evidentiality study was to see how informants talk about information 
from different "information sources"; therefore topics were chosen on 
the spot to elicit utterances about information of both direct and 
indirect experience of the speaker. In order to elicit discussion of the 
informants' direct experience, I asked about their work, family, and 
other things, in the past, at present, and in the future, which seemed 
most interesting to them. In order to let them talk about issues which 
are not directly concerned with them, I used social issues of the time. 
Fortunately for me, but unfortunately for the community at large, at 
that time, Japanese society had several serious public issues about which 
people were very well informed: the Aumu-shinrikyoo (Aum-cult) case 
and the Yakugai-AIDS (AIDS blood serum) case.2 

Spoken discourses were tape-recorded with a SONY cassette-
recorder TCM-S67V with microphone. Informants' written permission 
was sought prior to tape recording, and an outline of research purposes 
was briefly explained to each informant. Since the research topic is 
fairly linguistically specific, I believe that most of the participants did 
not pay much attention to my academic interest. I think that their 
nonchalant attitude to the purpose of my recording worked favorably in 
that the speech data were not influenced by the speakers' awareness of 
the purpose of research. 

Data collection was not combined with more comprehensive long-
term studies of overall linguistic performance of the informants since I 
am familiar with the culture of their speech environment. Therefore, 
the data are, more or less, "on-the-spot" data. For some informants, data 
from different speech situations was obtained to see the same speakers' 
variation of language use in response to changing social factors, but a 
large part of the collected speech was treated as "speech chunks" to 
present evidence of linguistic forms (i.e., evidentiality markings) in 
different speech situations. In this sense, the quantitative part of the 
analysis of linguistic forms will appear to be a fairly mechanical matter 
of looking for consistency in the occurrence of certain linguistic 
phenomena in certain types of social situations. 

However, qualitatively, attention was paid to the nature of the 
speech setting because it was speculated in the research plan that 
Japanese evidential expressions are under the influence of different 
kinds of "hearers", while many evidentiality studies (e.g. Palmer, 1986; 
and Chafe, 1986) suggest that the speaker's experience is the basic and 
major factor that the speaker relies on to employ evidential markers. 
We are all aware that even a short conversation can involve all 
attributes from the speaker, the hearer, and their relationship as well as 
other environmental factors of the speech (e.g. bystanders and 
location). As the target of sociological analysis of evidential forms, the 
hearer's social relationship with the speaker is the issue of analysis, i.e. 
how distant the relationship of the conversationalists is. Naturally, 
speakers have different types of hearers. Hearers can be superior (e.g. 
boss at work) or inferior (e.g. child) to the speaker, or on an equal status 
with the speaker (e.g. friend), and a speaker must have different 
"speech styles" respective to each kind of hearer. Theoretical linguistics 
as well as linguistic pragmatic theories often assume an idealistic 
speech situation with an idealized addressee, but in actuality, each 
speech situation may have different rules of linguistic epistemic coding. 
Perhaps we do not hesitate to say my salary is too low! to somebody 
intimate to us, but certainly we will be less direct to our superiors and 
phrase it as, for example, my salary seems to be lower than one would 
expect judging from reported industry averages. A speaker's 
epistemology level is marked differently by the choice of sentence 
modality. Therefore, sentence modality expressions are also a 
sociological issue of speech environment. In this sense, even though 
this research is not about comprehensive human speech behavior, it 
will be able to show us a subset of Japanese speech behavior in relation 
with social realities through a very small focal point, i.e., linguistic 
forms of evidentiality. 

THE DEFINITIONS 

First of all, formal and informal speech situations need to be 
defined. The primary subject of this study is to determine how 
situational features (e.g. types of occasion, speakers' biological and 
social background, power-relationship between speakers) influence 
speaker's evidential coding in naturally occurring speech in a variety 
of formal and informal speech situations. The speech level is usually 
controlled by the formality factors, in which the speaker's speech style 
varies along a dimension of formality. It has been pointed out that a 
formal occasion calls for polite language use (e.g. Shibatani, 1990; Ide, 
1982). The factors that contribute to formality are various: the nature of 
the addressee, the perceived formality of the occasion, the nature of the 
topics of discussion, the nature of the bystanders, and others (e.g. 
Shibatani, 1990). Formal and informal speech situations are often 
defined by the use of linguistic features such as syntactic standardness, 
phonological standardness, morphological fullness, etc. (e.g. Labov, 
1972b; Ervin-Tripp, 1972). However, for convenience, I consider in-
group speech settings to be informal, and out-group settings to be 
formal. 

Discrimination of uchi (in-group) from soto (out-group) is one of 
the fundamental principles of Japanese social interaction together with 
the social concept of vertical hierarchy. Historically, Japanese society 
has been considered to be group-oriented, in which people are 
conscious of their status as a member of their groups. A group can be 
any gathering of people such as colleagues at work, schoolmates, club 
members, family members, couples, siblings, neighbors, and town-
dwellers. People often refer to groups they belong to as uchi. Uchi, 
which is nearly the same as ie, literally means household. A 
businessman may call the company he works for uchi no kaisha which 
literally means my household's company. In the same way, a university 
professor or a university student may refer to his school as uchi no 
daigaku (lit. my household's university). Sociologists such as Pelzel 
(1970), Bachnik (1983), and Nakane (1967) argued that ie is not only a 
kin-based domestic group, but any unit in which social and economic 
life is involved. This concept of "my group is my household", as a matter 
of fact, contributed to the development of the Japanese economy 
through workers' devotion to their corporate employers. Interestingly, 
some sociological studies suggest that Japanese people do not have a solid 
sense of nationality (e.g. Sakaiya, 1991). This is probably due to the 
relation with immediate groups being of primary importance. Groups 
can be small or large, and an individual normally belongs to a number 
of groups. Some anthropological studies characterize Japanese people as 
being psychologically comfortable within their groups, and very 
apathetic to groups they do not belong to (e.g. Nakane, 1967; Doi, 1973). 

It can be argued that Japanese people are conscious of group 
territories as well as personal territories, which has the potential to 
influence language use. Usage of Japanese honorifics in the selection 
of verbs, nouns, and grammatical forms is often dependent on the 
relative group membership of the listener, speaker, and referent.3 In 
this research, the types of groups will involve "family", "close friends", 
"work friends", and others for informal settings, and "TV interview", 
"public talk", "teacher/student interaction", "formal conversation", 
"courtroom discourse", and others for formal speech settings. One 
problem that may arise here is that an individual may behave 
informally in a supposedly formal setting, or vice versa. Even though 
Japanese linguistic behavior is significantly influenced by highly 
structured honorifics, speakers' language use is not completely 
automatic in a given speech situation. Within an acceptable range, 
there are variations in situational use of honorifics (e.g. Ikuta, 1983; 
Wetzel, 1984; Dunn, 1992, 1996). "Affection" between the speakers may 
override the status difference and create informality in a formal 
environment, or "ill feelings" may bring forth an entirely 
informal-style conversation or ultra-formal language. Therefore, to make 
the analysis simple, along with the distinction between objective 
formal/informal types of speech situation, I paid attention to the 
formal/informal sentence-ending forms that informants used. 

Japanese plain sentences for informal conversation end with 
verbal and adjectival dictionary forms, or copula -da (present tense) and 
-datta (past tense) after noun and adjectival-noun, or their related 
forms (e.g. negative forms). Japanese polite sentences end with either 
verbal endings of -masu (affirmative present) and -mashita (past 
tense), or the copula forms of -desu (present) and -deshita (past), or 
their related forms. Usually, these polite sentence-endings are 
considered to be a form of honorifics known as "performative 
honorifics" (or "addressee-oriented honorifics").4 When a speaker used 
polite sentence-endings for most of a discourse, I understood that the 
speaker considered the conversation to be formal for himself, although 
the degree of formality varies widely. I used this criterion for 
grouping discourse types. However, I was also aware that one particular 
usage of honorifics does not indicate a unique social context. For 
example, plain form speech can be used by a speaker to a lower status 
addressee as well as to an equal status addressee. Addressee-oriented 
honorifics (i.e., polite sentence-endings) can be used by the speaker to 
addressees of lower, equal, and higher level. This indicates that a 
speaker's decision to use either "plain" or "polite" form involves other 
factors of "perceived distance" between himself and the speech situation 
besides the addressee's status. Therefore, it must be true that one 
particular social context may require one particular level of honorifics 
(e.g. a formal discussion with equal level addressees requires the 
speaker to use polite level of honorifics), but the reverse is not always 
true (e.g. the use of polite level of honorifics does not always indicate 
that the speaker speaks to his equal level addressees). The following 
table [4-2] indicates the relationship between the speaker-addressee 
social-status relationship and the possible use of plain, polite 
honorific, and hyper-polite honorific forms in spoken discourse: 
[4-2] Possible grammatical forms of Japanese and types of addressee 

                           lower-status   equal-status   higher-status 
                           addressee      addressee      addressee 
plain form                 yes            yes            no 
polite form                yes            yes            yes 
(performative honorific) 
hyper-polite form          no             no             yes 
(performative honorific) 
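
The relationships in table [4-2] can be restated as a small lookup structure. The following Python sketch is offered only as an illustration; it is not one of the analysis programs used in this study, and the names PERMITTED_ADDRESSEES, possible_forms, and possible_addressees are hypothetical. It shows that going from an addressee status to a form, or from a form to an addressee status, generally yields more than one answer.

```python
# Illustrative encoding of table [4-2]. All names are hypothetical.
PERMITTED_ADDRESSEES = {
    "plain": {"lower", "equal"},
    "polite": {"lower", "equal", "higher"},
    "hyper-polite": {"higher"},
}


def possible_forms(addressee_status):
    """Return the forms that table [4-2] allows for an addressee status."""
    return sorted(
        form
        for form, statuses in PERMITTED_ADDRESSEES.items()
        if addressee_status in statuses
    )


def possible_addressees(form):
    """Return the addressee statuses with which a form can occur."""
    return sorted(PERMITTED_ADDRESSEES[form])
```

Because possible_addressees("polite") returns all three statuses, the polite form alone cannot identify the status of the addressee, which is the point made in the surrounding discussion.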

Therefore, the polite form of honorifics as well as the plain form 
does not have decontextualized social meanings. This means that a 
speaker's decision to use the "plain" (informal) or the "polite" (formal) 
form indicates his integrated perception of the nature of a given speech 
situation. 
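
The grouping criterion described earlier in this section, under which a discourse counts as formal for a speaker who uses polite sentence-endings for most of it, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the input is assumed to be romanized transcription, only the four polite endings named in the text are recognized, the list of trailing particles is an assumption, and "most" is read as more than half. The names POLITE_ENDING, is_polite, and discourse_is_formal are hypothetical.

```python
import re

# Polite endings named in the text (-masu, -mashita, -desu, -deshita),
# optionally followed by sentence-final items such as yo or ne.
# The particle list and the romanized input are assumptions of this sketch.
POLITE_ENDING = re.compile(
    r"(masu|mashita|desu|deshita)(\s+(yo|ne|ka|wa|kara|ga))*\s*[.?]?\s*$"
)


def is_polite(sentence):
    """True if the sentence ends in one of the recognized polite endings."""
    return POLITE_ENDING.search(sentence.lower()) is not None


def discourse_is_formal(sentences, threshold=0.5):
    """Treat a discourse as formal when the share of polite sentence-endings
    exceeds the threshold (0.5 is an assumed reading of "most")."""
    if not sentences:
        return False
    polite = sum(1 for s in sentences if is_polite(s))
    return polite / len(sentences) > threshold
```

A speaker whose utterances mostly end in forms such as "desu yo" would be grouped as formal by this sketch, whereas a speaker with only an occasional polite ending would not.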

It is also necessary to clarify the "unit" of analysis. In this 
research, a "sentence" is regarded as a unit. A sentence is often 
considered unsuitable as a unit of speech. For example, in her 
research on discourse markers, Schiffrin (1987) pointed out that the 
sentence structure and the meaning of a "speech act" are not relevant to 
each other, and suggested that "interactionally situated language use is 
sensitive to constraints quite independent of syntax." Schiffrin 
concluded that "sentence structure is not the most useful unit to 
understand language use and social interaction" (1987:32). This may be 
true for many conversation/discourse analyses on interactional 
meanings of language use (e.g. turn-taking, silence, hedges, back-
channeling). This dissertation is also about interactional language 
behavior; however, this research views the issue from the sentence 
form, in particular, from sentence-ending morphological forms. 
Therefore, treating the sentence as a unit of analysis is inevitable. 
Unfortunately, spoken sentences are often so incomplete that 
identifying sentence boundaries is difficult (e.g. Crystal, 1980). 
This is a very critical problem in the Japanese language; the 
sentence-ending is often intentionally omitted in Japanese to make the 
modality ambiguous. The following conversation shows examples of 
incomplete sentences. 
(4-3) 
F5(1): Nani sore . 
       What that? (What is that?) 
F2(2): Nani gasu tte-iu-n-dakke . 
       What gas QUOT-n-Q (What was the gas called?) 
F3(3): Wakannai kedomo, VH toka, dokugasu... 
       don't know but VH(PROP) something like poison gas... 
       [I don't know, but a poison gas, something like "VH"...(incomplete).] 
F2(4): Nanka sono gasu o sutta dake de moo shin-jau... 
       somewhat that gas ACC inhale only INS soon die-(regret)... 
       [Something like, only inhaling the gas [regretfully] kills 
       people..(incomplete).] 
F3(5): Dakara chuushaki . o hito no soba de pyutto yatte... 
       so syringe ACC people POSS side LOC ONOM do (te) 
       [So, with one squeeze of syringe beside people..(incomplete).] 
F2(6): Dakara moo hito tare yo. 
       so only one drop VOC 
   (7): Pon-tte taraseba sono gasu ga yoosuruni nannte iuno, 
       ONOM (dripping) drop COND that gas NOM in short how say 
       kuuchu ni kakusan-sarete.. 
       in the air LOC scatter PASS(te) 
       [So, it's only one drop. If dropped (with onomatopoeic sound), 
       that gas, in short, how can I say, is scattered in the air 
       ..(incomplete).] 

In the conversation (4-3), which is informal, sentences (1) and (3) 
end in nouns without verb-endings. Sentences (5) and (7) end with 
te-forms of verbs that suggest the sentences are not completed yet. As 
noted in chapter one, the te-form of a verb means "action and~" or 
"progressive action" (e.g. Makino and Tsutsui, 1986) and thus 
connotes the "incompleteness" of action or the "state of being"; 
it is therefore ungrammatical to end a sentence with te-forms 
according to Japanese prescriptive grammar. In short, sentences (1), 
(3), (5), and (7) in the discourse do not have clear modality at the 
sentence end. This avoidance of the sentence-ending makes 
"period-less" sentences that produce a "fading-out" effect. A Japanese 
sentence-ending modality expresses the speaker's psychological attitude 
toward the context of the speech; he can show, for example, to what 
degree he commits himself to his statement. Therefore, it seems quite 
logical to assume that individuals use avoidance of clearly-formed 
sentence-endings as a strategy to express some degree of reservation 
toward their propositions (cf. also the case of te-linkage in chapter 
one, note 4). 

In this research, attention was paid only to completed sentences 
with sentence ending modalities, although some incomplete endings are 
exceptionally considered to have modalities as will be explained later in 
this chapter. 

THE LANGUAGE 
In this research, the target vernacular is "standard" Japanese 
(i.e., Tokyo dialect).5 Even though Japan is geographically a small 
country (smaller than the state of California), the Japanese language 
has hundreds of regional dialects. Some dialects are remarkably 
different from others phonologically, lexically, and morphologically to 
the extent that communication problems can occur among people from 
different areas; while other dialects are not very distant from the 
standard vernacular (e.g. Sanada, 1983; Kindaichi, 1977; Sato et al., 1986). 
In the Tokyo area, basically standard Japanese is spoken, but regional 
dialects are also heard.6 As noted earlier, the data from American sites 
for this study were obtained from Japanese speakers who resided in 
Madison, WI and Austin, TX. Informants' origin in Japan varied widely. 
The data collection in the summer of 1996 was carried out in the Tokyo 
area, but the informants' native dialects were also diverse. In both 
America and Japan, most of the informants used standard Japanese, 
but there were some informants who used their native dialects. If we 
assume that Japanese linguistic epistemology and culture are related, it 
is necessary to look into both standard and regional languages to see if 
their systems of evidentiality marking share the same concepts. 
However, in this research, standard forms have received the primary 
focus while the attention paid to regional differences is minimal. An 
effort is, however, made in this study to make some reference to 
non-standard utterances. Non-standard dialect speakers usually learn the 
standard dialect through institutions (e.g. schools) and other 
environmental factors such as media and human contacts. All Japanese 
speakers are assumed to understand standard Japanese, and a large 
proportion of native speakers of non-standard Japanese are perhaps 
practically "bidialectal". Since no significant difference was found in 
sentence modality between native and non-native standard dialect 
speakers in the data, I speculated that learners of the standard dialect 
perhaps learn the pragmatic rules of evidentiality coding as a part of 
the patterns of the Tokyo dialect, or that the major dialects share a 
common concept of evidentiality marking. If a unique pattern of 
evidentiality marking is seen systematically in certain non-standard 
dialect speakers' standard Japanese, it is possible to assume that the 
phenomenon is a "transfer" from their native dialect. Unfortunately, the 
dialect issue is far beyond the scope of this study; there are simply 
too many different dialects and the boundaries between them tend to be 
fuzzy.7 For these reasons, possible differences in evidentiality coding 
among regional dialects were not seriously pursued in this research. 

THE DATA 

I transcribed discourse utterances with attention to each word, 
complete or incomplete. Attention in transcription was not paid to 
phonological aspects such as variation of phonemes, nor to most 
aspects of conversational pragmatics such as "timing of speech", "silent 
or hesitation periods", "length of pronunciation", and "overlapping 
speech", other than "intonational patterns". As to intonational pattern, 
careful attention was paid to the sentence-final intonation: rising, 
falling, flat, or other. These intonational distinctions are described 
where the pattern affects the evidential meanings of the sentence-final 
forms. For example, the Japanese sentence-final particle -ne, which often 
functions to indicate the speaker's awareness of the shared status of his 
proposition with the hearer, is considered to have several different 
intonational tones (e.g. Oishi, 1985). It was assumed that a subtle tone 
difference may indicate a significant difference in evidential meaning, 
reflecting a speaker's cognition of the reality. 
THE DATA ANALYSIS 
Scope of analysis 

As clarified in the overall scope of this study, although there are a 
variety of evidentiality codings, attention in the analysis was paid 
primarily to the sentence-ending form, which is the main linguistic 
issue of this research. Other types of modality expressions which 
involve evidentiality aspects (e.g. "deixis", "adverb", "incomplete 
sentence", and "hedges") were also analyzed in relation to the 
sentence-ending forms. For example, when a sentence-ending form does not 
involve modality of indirectness or low assertiveness of the speaker, 
other types of modality are often substituted to produce a low-assertive 
mood in the sentence. An example is shown below: 

(4-4) 

F3: 
nannka jyuu-nenn mae no karute ga mada 
somewhat 10 years before MODI medical chart NOM yet 

nai-tte sawai-deru.
does not exist-QUOT fuss(te-form)-STAT


(F3: Somewhat [they] are clamoring saying the medical chart [of 
AIDS patient] of 10 years ago has not yet been found.) 

In the utterance (4-4), the speaker's topic belongs to the genre of 
public information that is not in her information territory. She used 
the bare direct-ending sawaideru (fussing), without incorporating 
addressee-conscious final-particles although the proposition was 
assumed to be known by her hearers. The sentence-ending modality of 
the utterance may be too direct from the standard viewpoint, but the 
word nannka (somewhat) at the beginning functions to mitigate the 
commitment expressed by the speaker to the proposition. Other examples 
with lexical modality of indirection are sentences with adverbs such as 
tabun (probably), osoraku (probably), and toka-nanntoka (something 
like that). Syntactically, negative and passive forms are used for the 
same purpose. Prosodically, changing tones provides a way to do so 
without making sentence-ending forms less assertive. However, as 
noted earlier, the sentence-ending form provides the sentence with its 
most dominant modality (cf. Chapter three). 

Method of analysis 
There are three factors involved in the analysis: (1) frequency of 
occurrence and the type of sentence-ending evidential form, (2) 
propositional content of the sentence, and (3) speech situation in which 
the sentence was uttered. Quantitative analysis was carried out through 
the creation of a database containing a representation of each relevant 
speech ending in this study. This data was then analyzed by writing a 
series of computer programs to extract various patterns in this data. 

The database is conceptually a collection of 7024 speech 
utterances which have the following information associated: 

(1) Informant identification (sex, age) 
(2) Discourse type/group setting (formal conversation, informal, 
family, courtroom, school, public) 
(3) Sentence-ending forms 
(a) Group identification for the forms (1-10) 
(b) Formal form (polite form)/informal form (plain form) 
(c) Ascending tone/descending tone 
(4) Information (proposition) type (A-H) of each sentence 

The computer programs used to analyze this data were written 
according to my specifications in Perl version 5.003 on an IBM RS/6000 
workstation running AIX version 4.2.1. Perl was chosen since it is a 
widely available language with powerful regular expression 
manipulation and associative arrays. 
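
The record structure listed above, and the kind of frequency count that the analysis relies on, can be sketched as follows. This is a hedged illustration written in Python rather than Perl and is not a reconstruction of the original programs; the class name Utterance, its field names, and the function group_frequencies are hypothetical, and the three sample records are invented toy values, not data from this study.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    """One database record; field names are hypothetical."""
    informant: str       # informant identification, e.g. "F3"
    sex: str             # "F" or "M"
    age: int
    discourse_type: str  # e.g. "family", "courtroom", "public"
    form_group: int      # group of the sentence-ending form, 1-10
    polite: bool         # True for formal (polite) form
    tone: str            # "ascending" or "descending"
    info_type: str       # information (proposition) type, "A"-"H"


def group_frequencies(utterances, discourse_type):
    """Count sentence-ending form groups within one discourse type."""
    return Counter(
        u.form_group for u in utterances if u.discourse_type == discourse_type
    )


# Invented toy records for illustration only; not data from the study.
SAMPLE = [
    Utterance("F3", "F", 40, "family", 1, False, "descending", "A"),
    Utterance("F3", "F", 40, "family", 7, False, "descending", "C"),
    Utterance("F5", "F", 35, "family", 1, False, "ascending", "A"),
]
```

The Counter plays the role that an associative array would play in Perl: keys are form groups and values are their frequencies within the chosen discourse type.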

Sentence-ending evidential forms 
The following [4-5] is a summarized list of the sentence-ending 
evidential forms for both informal and formal forms that occurred in 
the data. The complete list of all forms (approx. 350 forms) is in 
Appendix B. For the purpose of systematic and realistic analysis of the 
entire data, the list was created based on the theoretical background 
attributed to each form as well as early-stage analysis of the actual data. 
For convenience, prior to the detailed analysis, I classified them into ten 
different groups according to their syntactic and morphological 
forms. The largest distinction is made among "direct" (D), "indirect" 
(ID), and "question" (Q) forms. Direct-ending forms were further 
divided into five groups following the types of suffixed sentence-final 
particles or other final lexical items as well as intonational differences. 
One group consists of direct forms with questioning tones (DQ, 
"direct-form question"), and some groups involve the direct forms showing 
the speaker's sensitivity to the hearer's knowledge (SD, "semi-direct") 
through tag-question style, etc. Indirect forms are divided into two 
groups, hearsay and inferential evidentials. Epistemic-auxiliary-ending 
forms (AUX) and "I think"-type ending forms (THINK) are 
indirect forms, but are grouped separately from the hearsay and 
inferential forms. In doing so, my intention was also to classify the 
final forms by their degree of estimated assertiveness. 

[4-5] Japanese sentence-ending evidentials 8 

                                                      English 
                                                      equivalent 

Group 1:  D   Single-noun-ending, 
          D   Direct-form,                            DIRECT 
          D   Direct-form with sentence-final 
              particles such as -yo, -wa, -sa, -no, 
              and -wake, -kara, -node 
              and related forms. 

Group 2:  D   Direct-form with the sentence-ending    DIRECT 
              particles -ne and -na with falling      (getting attention) 
              tone (-ne . and -na . ) 
              and related forms. 

Group 3:  SD  Semi-direct-form with auxiliary         TAG QUESTION 
              "confirmation -daroo . " (falling 
              tone) and negative suffix -janai . 
              (falling tone) 
              and related forms. 

Group 4:  DQ  Direct-Question-form with               TAG QUESTION 
              sentence-final particle -ne with        NEGATIVE QUESTION 
              rising tone (-ne .), and 
              "confirmation -daroo ." and negative 
              suffix -janai . with rising tone. 
          DQ  Quasi-question forms 
              and related forms. 

Group 5:  SD  Semi-direct form with the particle      TAG QUESTION 
              -ne# (with rising + falling tone)       (as we both know) 
              and related forms. 

Group 6:  Q   Question forms with a question          QUESTION 
              particle -ka, or -no 
              and related forms. 

Group 7:  ID  Inference forms such as -mitai, -yoo,   IT APPEARS 
              and -rashii,                            IT SEEMS 
              and related forms. 

Group 8:  ID  Hearsay evidential forms such as        I HEARD 
              -datte, and -soo, 
              and related forms. 

Group 9:  AUX Epistemic auxiliaries such as 
              -kamoshirenai,                          MIGHT BE 
              -hazu,                                  MUST BE 
              "conjecture -daroo",                    PROBABLY 
              and related forms. 

Group 10: ID  "I think" forms such as -omou,          I THINK 
              and -kangaeru. 
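
The grouping in [4-5] can be pictured as a lookup from a sentence-final form, and where relevant its final intonation, to a group number. The following Python sketch covers only forms named in the summarized list; it is an assumption-laden illustration and not the classification program used for the approximately 350 forms of Appendix B. The dictionary names, the function classify_ending, and the tone labels "falling", "rising", and "rising-falling" are hypothetical. Forms whose group depends on context rather than tone (the question particle -no of Group 6 and "conjecture -daroo" of Group 9) are deliberately left unresolved.

```python
# Illustrative subset of list [4-5]; names and tone labels are hypothetical.
GROUP_BY_FORM_AND_TONE = {
    ("ne", "falling"): 2,
    ("na", "falling"): 2,
    ("daroo", "falling"): 3,   # "confirmation -daroo"
    ("janai", "falling"): 3,
    ("ne", "rising"): 4,
    ("daroo", "rising"): 4,
    ("janai", "rising"): 4,
    ("ne", "rising-falling"): 5,
}

# Forms whose group does not depend on tone in this sketch.
GROUP_BY_FORM = {
    "yo": 1, "wa": 1, "sa": 1, "wake": 1, "kara": 1, "node": 1,
    "ka": 6,
    "mitai": 7, "yoo": 7, "rashii": 7,
    "datte": 8, "soo": 8,
    "kamoshirenai": 9, "hazu": 9,
    "omou": 10, "kangaeru": 10,
}


def classify_ending(final_form, tone=None):
    """Return the [4-5] group for a romanized final form, or None when the
    form is not covered or cannot be resolved without context."""
    form = final_form.lstrip("-").lower()
    if (form, tone) in GROUP_BY_FORM_AND_TONE:
        return GROUP_BY_FORM_AND_TONE[(form, tone)]
    return GROUP_BY_FORM.get(form)
```

Returning None for an unresolved form, rather than guessing, reflects the fact that the same surface form (for example -daroo) is assigned to different groups depending on its function.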

In [4-5], most of the D (direct form), Q (question form) and 
ID (indirect form) endings have both informal and formal forms. 
For example, the direct affirmative non-past informal ending for to eat 
is (in context) taberu, and the formal ones are tabe-masu 
(addressee-honorific), itadaki-masu (humble), meshiagari-masu (hyper 
honorific), otabe ni narimasu (hyper honorific) and possibly others. No 
formal/informal distinction is made for sentence-final particles such as 
yo, sa, na, wa, no, and ne; therefore, when the ending is suffixed with 
a particle, the form of the verb, adjective, or copula before the particle 
is either formal or informal. 

Most ending-forms have a version with the particle -n (or -no) 
inserted after the direct forms of Verb, Adjective, or Noun before the 
ending copula -da (-desu for formal), constituting a V/Adj/N + n + da 
cluster. These forms are listed in the right-hand column of the list in 
Appendix B. Particle -no with this function is called the "nominalizing" 
particle, which is claimed to have an evidential function (Aoki, 1986; see 
chapter two). Kuno (1973) says that patterns of this type of -no da (or 
-n da) cluster give some "explanation" for the speaker's propositional 
context in declarative sentences, and that in interrogative sentences, -no 
desu ka? (with question particle -ka) asks for the hearer's explanation 
of what the speaker has heard or observed, as example (4-6) shows. 

(4-6) 
M8(1): naiyoo wa omoshiroi desu yo. 
       context CONT interesting COP(formal) PART(VOC) 
   (2): rabu ni kansuru koto desu kara. 
       love DAT relate COMP COP(formal) because 
   (3): uke o neratte-ru-n-desu yo. 
       popularity ACC aim(te-form)-STAT-n-COP(formal) (VOC) 

M8(1): Context [of my dissertation] is interesting (I am telling 
       you). 
   (2): Because it concerns love. 
   (3): [because] I am hoping to be well received [by readers] 
       (I am telling you). 

The utterances in (4-6) are part of the discourse in which M8 was 
explaining the research topic of his dissertation. In (3), he said that he 
decided on the topic expecting people's curious attention. This utterance 
gives an explanation for his previous utterance (1): the topic is 
interesting. The following discourse is an example of an -n da cluster in 
an interrogative sentence: 

(4-7) 
M13(1): sore wa chotto ikura sooseiki no terebi da 
        that TOP little even initial-stage MODI TV COP 
        to ittemo amari nai deshoo. 
        QUOT-COND not many exist(NEG) AUX(CONF) 
F22(2): uun... maa naku mo nakatta desu ne. 
        Well well exist(NEG) exist(NEG)(PAST) COP(FOR) PART(RAPP) 
M13(3): aru-n-desu ka . 
        exist-n-COP(formal) Q 

M13(1): Even though it was one of the initial-stage TV programs, 
        that did not happen often (didn't it?) 
F22(2): Well, it is not that [the things like that] did not happen. 
M13(3): Did it happen (as you said)? 

Here, M13 and F22 were discussing a "live" TV soap drama of some 
twenty-five years ago in which unplanned replacement of main 
characters was carried out without informing the viewers. In (1), M13 
was thinking that such an occurrence must have been unusual. F22 has 
more experience in the field and said it was not unusual in (2). M13 
requested more explanation from F22 in sentence (3) by simply using 
the -n-desu-ka? cluster. From the perspective of evidentiality, n-da in 
M13's utterance (3) suggests that the utterance is based on the evidence, 
i.e., the utterance (2) from F22. 

McGloin (1980) further developed this analysis of -n da and 
argued convincingly that a speaker uses the -n da cluster to 
subjectively explain, to persuade, to convince or to give background 
information in a situation where certain information is known by both 
parties, or either the speaker or the hearer. Kuno's and McGloin's 
analyses can be interpreted to mean that the -n da expression is 
concerned with (1) sharing information between two parties (from the 
speaker to the 'ignorant' hearer), (2) checking the truth value of the 
speaker's information with the resourceful hearer, or (3) confirming 
the shared status of the information between the two parties. Therefore, 
-n da clearly functions as an evidential in various ways. McGloin also 
found that "in purely objective information giving/seeking situation, 
no desu cannot be used" (1980: 144) suggesting the subjective nature of 
the particle -no which asserts that the speaker's proposition is 
supported by evidence. 

Further explanation of the group-by-group sentence-ending evidential 
forms summarized in [4-5] is provided below: 

(Group I sentence-final evidential forms) 

The first group of the sentence-ending forms (Group I) is 
assumed to comprise the most direct forms used in Japanese, and 
accordingly is considered appropriate for presenting any information to 
which the speaker attaches high truth value. Theoretically, the first 
listed form, noun-ending, is not a completed sentence ending, so it should 
not be of major concern to this study. However, it was observed in casual 
conversation, family discourse in particular, that the simple 
noun-ending was used too often to be ignored. So, I listed incomplete 
endings with a noun as a kind of direct modality form. The direct ending 
is the plain form of the verb, adjective, or copula without any suffix. 

In conveying information which is truthful from the speaker's 
viewpoint, however, in many instances, speakers who are sensitive to 
the existence of hearers may consider plain direct-forms to be too 
"un-interactional" and add some kind of sentence-final particle or other 
kind of modality expression to their proposition to create different 
types of direct mode. As briefly noted in chapter three, sentence-final 
particles are hearer-sensitive and, like -n da clusters, are not used in 
formal Japanese writing or formal public speech which does not assume 
a specific audience (e.g. Saji, 1956). Each particle is said to connote 
some kind of conversational nuance from the speaker to the hearer. It is 
sometimes very difficult to translate the meanings attached to the 
proposition by the use of final particles, so they are often left 
untranslated in other languages. As noted in chapter three, it is said 
that the particles -yo and -sa function to "impart information which 
belongs to the speaker's sphere to an addressee" (McGloin, 1990), 
"forcing the speaker's view on to the hearer" (Tokieda, 1951) or 
"focusing on the informational aspect of the proposition" (Maynard, 
1993). Kinsui (1992) said that by using the particle -yo a speaker 
"declares" his intention to input the information (i.e., his proposition) 
into his indirect memory which is reserved for the hearer's assumed 
knowledge (p. 8). Examples of -yo usage are shown in sentences (1) and 
(3) in (4-6). -Sa is used in the same way as -yo although it probably 
connotes masculinity more strongly than -yo. 
-Wa and -no have been characterized in two different ways: 
Ueno (1971) said that they have the same function as -sa and -yo, 
while McGloin (1990) considered that -wa and -no create rapport, or 
request sympathy from the hearer. It seems that -wa and -no are, as 
McGloin argued, slightly different from "declarative" -yo and -sa. In 
my analysis, they are not "declaring" but rather "extending" the 
speaker's rapport to the hearer. However, at the same time, it is also true 
that -wa and -no particles convey less sense of rapport than -ne. For 
this research, I included -wa and -no evidentials in Group (1), the 
category of highly-direct evidentials. Therefore, these Group (1) final 
particles are generally speaker-oriented. 

The following are some examples of -wa and -no. 

(4-8) 
F5(1):  nihon-tte ima nan-nin kurai eizu kanja ga 
        Japan-QUOT now how many people about AIDS patient NOM 
        iru ka shitte-masu . 
        exist COMP know(te-form) formal 
F16(2): seikakuni wa wakaranai wa . 
        correctly CONT know(NEG) PART(VOC) 

F5(1):  Do you know how many AIDS patients are here in Japan now? 
F16(2): I do not know precisely. 

The use of -wa by F16 in sentence (2) shows a common usage of -wa in 
imparting the speaker's own state of being. -Wa typically connotes 
femininity (in the standard dialect), as does -no. It is also difficult to 
translate the nuance of -wa and -no into English. 

(4-9) 
F6(1): gakkoo ga owatte, minna de atsumatte, 
       school NOM finish(te-form) everybody INS gather(te-form) 
       ja Sakae e ikoo ze -tte koto ni 
       then Sakae DIR go(VOL) PART(VOC) QUOT COMP DAT 
       natte minna de jitensha de kuridasu no. 
       become(te-form) everybody INS bicycle INS crowd to 
   (2): de machi e itte, chika-gai ga aru no. 
       then downtown DIR go(te-form) underground mall NOM exist 
   (3): soko e haitte, shabekuru to iu... 
       there DIR enter(te-form) chat QUOT 
   (4): sorede ie e kaette syukudai o suru no. 
       then house DIR return(te-form) homework ACC do 

F6(1): After school, [we] all gather, and decide to go to Sakae, and 
       everybody goes by bicycle (no). 
   (2): Then, go into the town, there is an underground mall (no). 
   (3): [We] go into there, and talk, 
   (4): Then, [we] go home and do homework (no). 

In (4-9), the speaker explained what she habitually did in her high 
school days; naturally, therefore, her commitment to the proposition is 
very high. The -no ending is used in sentences (1), (2), and (4). 

I have included -kedo (and -ga) (meaning but) and -kara (and 
-node) (meaning because) as sentence-final forms although they are 
not usually considered to be so. They are conjunctions and if a sentence 
ends with one, the sentence is, grammatically speaking, incomplete. 
However, the original meanings of these conjunctives are often 
ignored, and they are used to end a sentence in a fading-out fashion 
without clear direct modality. Since utterances ending with one of 
these conjunctives do not usually entail the hearer's knowledge but 
simply muffle the directness of the utterance, I included these in the 
direct-ending group. An example of -ga use is shown below: 

(4-10) 
F24(1): eeto chiryoo houhoo no minaoshi o nasatta 
        well, treatment method MODI reexamination OBJ did(HON) 
        ka dooka to iu koto ni tsuite ukagatte mitai 
        whether QUOT COM regarding ask(HON) try(DES) 
        to omoimasu ga. 
        COM think(formal) but 
   (2): jiko chuushahoo o hikaeru desu toka ne, 
        self injection method OBJ refrain COP(FOR) like RAPP 
        kurio e no kirikae, shinsenna toketsu kesshoo o 
        domestic medicine DIR MODI change fresh frozen serum ACC 
        katsuyoosuru toka desu ne , 
        utilize like COP(FOR) PART(RAPP) 
        dooiufoona koto o gutaitekini nasai-mashita ka . 
        what kind thing OBJ practically did(HON) Q 

F24(1): Well, I would like to ask if you reexamined your 
        treatment (ga). 
   (2): What sort of thing did you do actually in terms of 
        reexamination of treatment of hemophiliacs, such as 
        refraining from self-injection, use of domestic blood, 
        utilization of fresh frozen serum? 

The use of -ga in (1) above does not have any particular 
meaning. It helps to give an impression that the sentence is less 
declarative. The same is true of the use of sentence-final -kara 
(because) or -node (because), as shown below: 
(4-11) 
M1 (1): yuushuuna jinzai dattara oyakusho de mo 

excellent human resource COP(COND) government LOC also 

kigyoo de mo onajiyooni kyosoosite hippattekuru-tte 
company LOC also alike compete(te) recruit-QUOT 

iu no ga soo iu koto ga atte shikaru-beki na-n-da. 
COMP NOM such COMP NOM exist(te) should -n-COP 

F5 (2) soo desu ne. . 
so COP(FOR) PART(RAPP) 

M1: (3) shikamo amerika no baai wa ne . 
moreover America POSS case CONT PART(RAPP) 

dentootekini yakunin ni nattara kyuuryo ga 

historically civil servant DAT become(COND) salary NOM 

sagaru-tte iu no ga aru kara. 

decrease-QUOT COMP NOM exist because 

(4): futuudato maa sukunatutomo ne .,
usually well at least PART(RAPP)


daitooryoo ga kawaru tabi ni ue no renchuu-tte 

President NOM change turn TEMP top MODI people-QUOT 

iu no wa kubi o sugekaerareru-tte.

 COMP TOP neck OBJ replace(PASS)-QUOT.
M1(1): If [they are] excellent staff, the government and private 


companies should compete to recruit those people. 
F5 (2): It is so. 
M1(3): Moreover, (kara) in the case of America, traditionally, one's
income decreases if one becomes a civil servant.



(4): Usually, well at least, each time a new president is selected, 
high class officials are said to be replaced. 
The sentence ending with -kara in (3) does not denote its literal
meaning, because there is no phrase or sentence that can be meaningfully
connected with sentence (3) by the conjunctive -kara. Rather,
when talking about American politicians, a topic which is supposed to
be other people's information, the speaker used -kara, thereby avoiding
the bare direct ending of the verb -aru (exist) in (3).
The ending form -wake (literally "reason") functions in a way similar
to -n da, extending an "explanation" from the speaker about the
background of his proposition:

(4-11) 
F3(1): gakuhi ga zero. 
tuition NOM zero 

F5(2): zero. ii wa ne#.
zero good PART(VOC) PART(SHARE) 

F3(3): daigaku made zero yo. 
university till zero PART(RAPP) 

F5 (4): sore zenbu zeikin . 
that all tax 

F3 (5): zeikin
tax 

M22(6): sono kawari josei mo yamenakute sumu yooni 
that instead female also quit(NEG)(te) settle in such a way 

hatarakeru kankyo-tte tukutte -aru wake. 
work(POT) environment-COMP make(te)-RES reason 



F3(1) School tuition is free 

F5(2) Free? That is good. 

F3(3): It is free to university (I am telling you). 

F5(4): That is all [paid by] tax? 

F3(5): Tax. 

M22(6): Even though [Swedish people have to pay high tax], the 

environment is well-conditioned to allow females to 

continue working (that is the background of high tax). 

Wake as used in M22's utterance (6) performs the function of 
explaining that the utterance is giving the background information for 
what has just been said. The degree of evidentiality attached to wake-
ending seems high. 

Combined forms of Group (1) evidential ending forms, such as 
wayo and wakesa, also belong to this group. 

(Group 2) 

Group (2) final forms typically involve the particle -ne. -Ne and
-na are said to be used to "seek confirmation from the hearer" (McGloin,
1990), or "solicit confirmation" (Maynard, 1993), but at the same time,
-ne and -na function to create rapport, or request sympathy from the
hearer (e.g., McGloin, 1990; Tokieda, 1951) or interpersonally to "solicit
emotional support" (Maynard, 1993).

It is noted that each of the particles -ne and -na is affirmed to 
have two different functions: "requesting confirmation" and 
"requesting/sending rapport". However, how these two functions are 



linguistically distinguished has rarely been discussed. The prosodic 
features of sentence-final particles seem to have been rarely 
investigated other than by Tanaka (1973, 1977) and Oishi (1985). Oishi 
argued that intonational patterns determine the different functions of 
the particle -ne. He pointed out that -ne (and yone) can be uttered 
with four different tones: 

(1) the pitch of the final syllable of the word preceding the final
particle nee is lower than the pitch of its first vowel /ne/ and the
pitch of this vowel is higher than the second /e/;
(2) the pitch of the final particle is higher than that of the final
syllable of the preceding word in one syllable particle; or the pitch
of the final syllable of the final particle is higher than that of the
penultimate syllable;
(3) the pitch of the final particle is lower than that of the final syllable
of the preceding word in one syllable particle; or the pitch of the
final syllable of the final particle is lower than that of the
penultimate syllable;
(4) no pitch differences between the two identical vowels (/e/) in nee.
Oishi argued that the discourse meaning of each ne is different. He
referred only to ne and yone, but this observation must apply to other
ne-related final particles (e.g., wane) and to the particle na, which is a
slightly vulgar version of ne. (p. 60)

Four different pitch types were also confirmed in my data. 



Taking Oishi's distinction into consideration, I assumed three types of ne
in my model as described below: 9

(a) Ne.: Ne with a falling intonation, which is not necessarily 
asking for either confirmation or agreement from the hearer, is simply 
placed by the speaker between phrases or at the end of sentences to 
make utterances interactive by requesting attention and rapport from 
his hearer or by mildly asserting the speaker's contention. So logically, 
and also empirically, a speaker can insert this ne after every word or 
phrase of his sentence. 
(4-12) 

F12(1): nanka ne . aakansoo ni ita toki ni ne . 
something like Arkansas LOC lived when TEMP 

ano hito ne . gabanaa ka nanka datta desho. 
that person Governor something like COP(PAST) AUX(CONF) 

(2): sono toki ni ne . sekuretarii datta to omou kedo ne .
that time TEMP secretary COP(Past) QUOT think but 

maa, chotto bijin no ko ga ite ne . 

well a little pretty girl MODI girl NOM exist(te) 

F12 (1): something like (ne), when he was in Arkansas (ne), 
that person (ne) was the governor or something like that 
(wasn't he?) 

(2): At that time (ne), I think that was his secretary (ne), 
well, there was a cute girl (ne). 
Characteristically, this -ne seems to be related to
information that belongs to the speaker's territory and is not known by
the hearer. I call this -ne "rapport -ne". Some small proportion of



the speakers habitually pronounce this -ne as a short rising sound. This
use of rising -ne functions as if the speaker is asking the hearer "Are you
listening?" while conveying information that is likely
unknown to the hearer. This rising version of "rapport -ne" is easily
distinguishable from the real "rising -ne" (the second type of -ne) because
it obviously does not involve the speaker's concern about the hearer's
knowledge. I decided to group these two types of "attention-getting -ne"
into the same category because the evidential function of both ne's
is the same: to get the hearer's attention or sympathy to his proposition.

Falling -na has the same function as the falling rapport -ne. 
(4-13) 
M1 (1): friitaa. 

"freeter" (self-employed person usually working independently) 

(2): friitaade ne. 
freeter-(te form) PART(RAPP) 
(3): de, kekkyoku syuushoku mo sezu ni ne . 
then after all get a job even do(NEG)-adverb PART(RAPP) 

jyuu-nen bakari asonda-n-da na .
10 years about had leisure-n-COP PART(VOC)

(4): nanka kissaten no keiei ka nanka
somewhat coffee shop MODI management or something 

yatteta-n da na. 
did (te-form)STAT -n COP PART(VOC) 

M1 (1): "Freeter." 
(2): [He was] a "freeter" 
(3): Then, after all, he did not get a solid job [as every university 

graduate does immediately after graduation] and had a 
leisure time for about 10 years (na). 



(4): [He] did something like managing a coffee shop (na) (this 
refers to my previous utterance). 

In (3) and (4), -na is used sentence-finally. The speaker was
talking about a Japanese author's personal information. The use of -na
in (3) and (4), as well as ne. in (2) and (3), suggests that the speaker
assumed the hearers did not know the information (so he was informing
the hearers of what he knew). Now, we turn to the second type of -ne.

(b) Ne .: Ne with a rising intonation is often used by a speaker
to ask for confirmation on the truth value of his proposition from the 
hearer. Therefore this -ne is often used for the proposition which is 
assumed to be known by both parties. This -ne often sounds like a 
question because the speaker's surface intention is to ask for the 
hearer's agreement. The major evidential function of this -ne is to 
confirm that both parties have the same information in either one's 
information territory or simply as knowledge. I call this ne 
"confirmation -ne". 

In the following example, a school teacher was asked by a student 
to change what the student had written on the board, and the teacher 
changed the writing and then tried to confirm her understanding of the 
student's meaning: 

(4-14) 
F25: koo iu fuu ni kakikaeru to iu koto desu ne . 
this way like rewrite QUOT COMP COP(formal) CONF 
F25: It is said to rewrite this in the way like this (am I right?) 



In a sense, this -ne . functions in a way similar to the question
particle -ka. The difference between the two is that -ka is used for a
question for which the speaker is supposed not to have an answer; the
proposition is not in the speaker's territory or knowledge. The next -ne (the
third type) involves the hearer's knowledge more deeply than the
confirmation -ne.

(c) -Ne#: The third type of -ne is the one with an intonation that
first rises then falls, and is usually pronounced longer than the (a) or (b) type
-ne, or with a flat prolonged intonation without falling. This -ne is
characteristically used to end a proposition which the speaker knows
to fall into both parties' information territories. I call this -ne
"sharing -ne".
From the viewpoint of discourse management, this -ne functions to send
a sense of camaraderie, or in-group intimacy in sharing information,
and functions evidentially to show that the truth value of the speaker's
proposition is fully acknowledged by both parties. In the
following example, (4-15), the speaker and the hearer were talking
about the hearer's shadow-picture products, and since they were both
observing these products at the time, they were actually sharing the
same experience, which enhances the use of "sharing -ne#":
(4-15) 
F22: kore wa ari desu ka. Ari to kirigirisu no

 this TOP ant COP(FOR) Q Ant and grass-hopper MODI 

o-hanashi desu ka . 

HON-story COP(FOR) Q 



kore nannka mo zuibun komakai desu ne#
this also very fine COP PAR(SHAR)


F22: Is this an ant? Is this the story of ant and grass-hopper? 
This one is also very finely-cut (as we both can see). 

Although Kamio (1994) emphasized the importance of -ne as a 
pragmatic discourse marker, he discussed one general -ne which is 
obligatory when being used for information that belongs at least to the 
hearer's territory. Takubo and Kinsui's theory also considered -ne as 
one general concept, in that -ne confirms the sameness of existing 
information in the speaker's memory and the hearer's memory area, 
i.e., type (b) and (c) -ne in this study. However, considering the 
concept of evidentiality coding, these three types of -ne must be 
differentiated. 
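
The three-way distinction described above can be summarized as a small lookup. The following is only an illustrative sketch in Python, not part of the coding procedure used in this study; the names Contour, NE_TYPES, and classify_ne, as well as the attention_getting flag, are hypothetical labels introduced here for the illustration.

```python
from enum import Enum


class Contour(Enum):
    """Intonation contours named in the description of types (a)-(c)."""
    FALLING = "falling"
    RISING = "rising"
    RISE_FALL = "rise then fall"
    FLAT_PROLONGED = "flat, prolonged"


# Contour -> (label used in this study, typical status of the proposition).
NE_TYPES = {
    Contour.FALLING: ("rapport -ne", "speaker's territory, not known by the hearer"),
    Contour.RISING: ("confirmation -ne", "assumed to be known by both parties"),
    Contour.RISE_FALL: ("sharing -ne", "falls into both parties' territories"),
    Contour.FLAT_PROLONGED: ("sharing -ne", "falls into both parties' territories"),
}


def classify_ne(contour, attention_getting=False):
    """Return the -ne label for a contour.

    attention_getting is a hypothetical flag for the short rising -ne that
    some speakers use habitually; the text groups it with rapport -ne.
    """
    if contour is Contour.RISING and attention_getting:
        return "rapport -ne"
    return NE_TYPES[contour][0]


print(classify_ne(Contour.RISING))
print(classify_ne(Contour.RISING, attention_getting=True))
```

The flag is needed because, as noted above, a rising contour alone does not separate the habitual rising "rapport -ne" from "confirmation -ne"; the coder has to judge whether the speaker is concerned with the hearer's knowledge.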

There are individual differences in -ne pronunciation and some 
people prefer one type of -ne over others regardless of the 
propositional type. However, generally, it seems that a high proportion 
of informants had these three types of -ne. Each type of -ne was often 
used independently as if it were deictic and representing the sentences 
which were spoken before. Observe the following examples: 
(4-16) 
F5(1): de souru daigaku-tte arimasu -deshoo . 

then Seoul Univ. QUOT exist(FOR)-AUX(CONJ) 

maa kankoku no toodai, asoko ni hairu no wa 

well Korea POSS Tokyo Univ. there DIR enter COMP TOP 



kankoku de wa ichiban no eiyo rashikute
Korea LOC CONT primary MODI honor AUX (te)(it seems)

F18(2): un un rashii
yes yes seems

F5(3): ima, nihon wa soo de mo nai-deshoo .
now Japan CONT so COP NEG AUX (CONF)

sorehodo de mo mukasi hodoja-nai to omou-n-desu.
such degree COP old times degree(NEG) COMP think-n-COP(FOR)

nannka sugoi mitai.
somewhat extreme seem

jisatusha mo ooi mitai.
suicide also many seem
F18(4): nee# 

F5(1) Then, there is a university called Seoul University, as you 
know. It is like Tokyo University of Korea, it seems very 
difficult to enter there, 

F18(2): Yes, it looks like so, 

F5(3): Isn't Japan as bad as before (regarding the entrance 
competition into the Tokyo University)? I think the 
situation is not so bad as old times. It seems that 
(competition to enter the univ in Korea) is very hard. 
It looks like there are a lot of suicides. 

F18(4): nee# (Yes I agree it does so.) 

In this conversation, in F18(4), the speaker uttered only "sharing -ne",
meaning that she shares the information presented by F5(3).
"Sharing -ne" represents Group (5) endings. Group (2) ending forms are
mostly "rapport -ne" and its related forms.



(Group 3) 

This is a group of semi-direct (SD) forms. Important forms in this
group are the auxiliary "confirmation-daroo." (-deshoo. in the polite
form) with a falling intonation, which is in effect almost equivalent to
the English tag-question isn't it., and -janai. (or dewa nai) with a falling
intonation, which is also functionally similar to the English tag-question
isn't it.
(4-17) 
F1 (1): video wa itsu miru no .

 video CONT when watch Q 
F2 (2): watashi yoru nechau hito dakara, 
I night sleep(regret) person because 
video mitete mo nechau kara. 
video watch(STAT) also sleep(regret) because 

(3): Un, dakara, asa miru no.
Yeah, so morning watch PART(VOC)
(4): de doyoobi wa okeiko ga atta- ri suru kara
then Saturday CONT teaching NOM exist-(etc.) do because
kekkyoku asa hayaku okite, osooji toka-tte iroiro 

after all morning early rise(te) cleaning etc-QUOT various 

shinakya naranai desho. . 
do-obligation AUX (CONF) 

F1 (1): When do you see videos? 
F2 (2): Because I sleep (early) at night, I fall asleep even 
when I am watching movie videos, 
(3): Yes, so, I watch them early morning. 
(4): Then, on Saturdays, I have students or something, 
therefore eventually, I wake up early in morning and 



have to do laundry and other things, (don't I .) 

F2 in (4-17) talked about part of her life-style: she watches
movies in the morning. Since the proposition is her own information,
she did not need to ask for the hearer's agreement on it; the direct
sentence-ending for (4) is perfectly acceptable. However, how F2
spends Saturdays as a housewife who teaches flower arrangement on
those days is not beyond the hearer's imagination, given the fact that
F2 and her listeners are close friends. Moreover, doing laundry and
cleaning in the morning (every day) is a well-shared daily schedule among
Japanese wives. In this way, "confirmation-desho." is often used to
express the speaker's information which may be known by the hearers.
The negative suffix -janai seems to be used in the same way, as in (4-18):
(4-18) 
F7: (1) tonari ga juuniji kara sutereo ookiku kake-dashita

 next door NOM 12a.m. from stereo loudly play-started 

no ne..

 PAR(VOC) PAR(RAPP) 

(2) urusai toka omotte, jibun de iu no mo sankai me toka
noisy like think(te) myself INS say COM three times like
yonkai me toka onnaji koto o iu no iya -janai ..
four times like same thing ACC say COMP don't like- (NEG)

(3) dakara furonto ni denwashite ano urusai-n desu yo
so front DIR call(te) well noisy-n COP(FOR) VOC
nannte ittara "We'll send somebody up" toka itta
something like said(COND) like said

kara sutaffu ga kuru no ka na toka omottara 

because staff NOM come COMP wonder like thought(COND) 



ikinari don don toka itte omawarisan ga kicchatte.
suddenly bang bang like say(te) police officer NOM came(regret)
(4) majison poliisu. 
Madison Police 

(5) de watashi ga repootoshiteru janai .
then I NOM report(STAT) (NEG)
(6) tonari no heya no ruumu-meito o.
next door MODI room MODI room-mate ACC
(7) watashi-tte meen-janai.
I -QUOT mean-(NEG)
F7: (1) My next door neighbor started to listen to music loudly 
from twelve midnight (rapport -ne). 

(2) I thought it was noisy or something, it was embarrassing 
to complain three, four times [to the neighbor] myself 
(isn't it.). 
(3) So, I called the front desk [of the dormitory apartment] and 
said [the neighbor was] noisy, then [they] said "we'll send 
somebody up" or something, so I thought the staff might 
come, then suddenly, bang bang bang [at the door], then 
policemen came (te-incomplete). 
(4) Madison Police. 
(5, 6) Then, I am the person who reported on the roommate 
(aren't I .) 

(7) Aren't I mean?
In explaining how she in effect reported her own roommate to the
police, the speaker used -janai. (isn't it.) to end sentences
with which the hearer can reasonably identify: it is



understandable that to complain repeatedly is embarrassing (sentence 
2), and the hearer has already been informed that the speaker is the 
person who reported the case (sentence 5). -Janai. is the contracted 
form of de-wa-nai (S + copula + contrastive + negative). Although this 
form does not function to negate the proposition which it is attached to, 
its surface syntactical structure implies that S (i.e., proposition) is 
understood information among conversationalists. 

Group (3) ending forms are called "semi-direct forms" (SD) in this
research.

(Group 4) 

The "rising-ne " belongs to the Group (4) sentence-ending 
forms, which are generally used to express the speaker's intention
to request the hearer's agreement. "Rising -janai." (isn't it. or
negative question) and "rising daroo." (isn't it.) give the impression
that the speaker is asking the hearer a question. These forms are also
semi-direct forms; however, since the forms of this group are direct
with an obvious questioning intention on the part of the speaker, I call these forms
"direct-question forms" (DQ forms). Therefore, the ending forms in this
group are likely to be used as evidentials for propositions which are known by
both parties. An example of this rising -janai. is seen in (7) of (4-18). It
is different from the falling -janai in the same discourse. In (7), the
speaker is really asking if the hearer agrees with the proposition that the
speaker is mean. The following discourse shows a case of rising deshoo.



usage: 

(4-19) 

F27 (1) eizu happyoo shita hito sukoshi wa 
AIDS announcement did person little CONT 

enjosareta-n deshoo .
helped(PASSIVE)-n AUX (CONF)

(2) sorede jibun ga eizu-datte juukyu-sai no nantoka -iu...
then oneself NOM AIDS-QUOT 19 years old MODI somebody-QUOT

F5 (3) Aa, sono hanashi yonda.
Yeah, that story read(PAST)

(4) otoko-no-ko deshoo .
boy AUX(CONF)

(5) ano hito nannka kawaisoo janai .
that person somewhat pity NEG
F27 (1) Those people who declared that they caught the virus (from 
the blood-forming medicine) have been helped at least a 
little, haven't they? 

(2) So a 19-year-old said he has AIDS.
F5 (3) Oh yes, I have read that story.
(4) That is a boy, isn't he? 
(5) That person is, somewhat, miserable, isn't he? 
I believe that the argument that ending forms with rising 
intonation (i.e., -ne., -deshoo. and -janai.) without question-particle 
(ka?) belong to this group is intuitively appealing. The speaker uses 
the rising tone to ask if his proposition is right in light of the hearer's 
knowledge but he does not use -ka because it is not a genuine question; 
the speaker also has the information. 



Sentence-medial or sentence-final use of rising intonation, which I
call a "quasi-question," is also included in this group. Lately,
sentence-medial and sentence-final rising tones on phrases or words in
declarative sentences have become very popular among young speakers. A good
example is (1-1), the discourse excerpt cited at the beginning of this dissertation:

(1-1) 
F2 (1): A, soo. 
Well so 

(2): ano hito ga ichiban nan-te iu no ., yoosuruni tsukutta .
that person NOM most how-COMP Q in short made

(3): sarin o tsukutte yoosuruni jibun de maita-tte iu ka.
Sarin OBJ make(te) in short oneself INS scattered-COMP or

(4): yoosuruni kagakusha .
in short scientist

(5): hotondo ga daigaku no toki ni soo-iu bunnya o
most NOM university MODI time TEMP so-QUOT field ACC 

senmon to shite yatteta hitotachi . da kara tabun 

major DAT make(te) did people therefore probably 

tabun-tte iu ka yoosuruni kenkyuu .

 probably-QUOT or else in short research 

F2(1): Well, it is so. 
(2): that person did, the most, what shall I say, in short, made 

(Sarin gas)? 
(3): He made Sarin, and, in short, shall I say he scattered himself? 
(4): In short, a scientist? 
(5): Most of them studied that kind of field as their major in their 

university days, so probably, shall I say probably, in short, 
research? 



F2 used a rising intonation at the end of phrases and
sentences, which makes a declarative sentence sound like a question
without an explicit question marker -ka (i.e., sentence-final -ka in
Japanese). But the speaker was not posing questions. This use of rising
intonation at the end of, and also within, a non-question sentence is
novel among speakers of Japanese.10 The phenomenon was very new to
me in 1996, so I had opportunities to discuss this issue with my friends in
Japan. It seems that a speaker uses a rising tone for his sentence or
for some words within the sentence to express, on the surface, that he is not
confident in his proposition or selection of lexical items. I understand
that this "untraditional" rising tone produces an effect of modesty; with
the rising tone, the speaker pretends to ask for his hearer's agreement with
what he is saying. In this sense, the quasi-question sentences or
phrases are substituting for traditional sentence endings such as
-janai. or -deshoo.11 At least this new "fad" phenomenon indicates that
intonation can be an evidential marker.

(Group 5) 

Group (5)'s main ending-form is the "sharing -ne#" which is 
most likely used as an evidential for fully shared information among 
speakers as noted earlier. Usually, a sense of camaraderie is emphasized 
in the use of ne. The forms in this group are semi-direct forms (SD). 




(Group 6) 

Group (6) contains question endings which involve the question 
particle, -ka (polite sentence) and -no (casual sentence). Some 
question forms with falling intonation are not pragmatically intended 
to be questions to the hearer. It seems that the speaker uses these 
falling-tone question endings to pretend to be modest enough to ask the 
hearer's judgement of the truth value of his proposition. 

Question sentences with a rising tone normally seek
information which the hearer is assumed to have. Therefore, Group
(6) ending forms are likely to be used for the hearer's information that
is not known to the speaker.
(Groups 7 and 8)

So far the sentence-ending forms are all direct except questions.
Groups (7) and (8) consist of indirect sentence-ending forms (ID).
-Mitai (it looks like), -yoo (it appears to be) and -rashii (it seems) are
the forms for inference (Group 7). (Da)tte (I heard), -soo (I heard),
-to kiita (I heard), -to iwareta (I was told), -to iu hanashi (It is
said), and others are all hearsay expressions (Group 8).

(Group 9)

 Group (9) represents sentence-ending forms using epistemic 
auxiliaries of necessity and possibility (cf. chapter two). 
Kamoshirenai (it might be), hazu (it must be), ni chigainai (it must 



be), and "conjecture daroo" (probably) are used to indicate the
possibility that the proposition is true, in that the speaker makes a
subjective judgment based on some kind of evidence. Like the
evidentials of hearsay and inference, epistemic auxiliaries are instances
of the combination of structural and lexical expressions of evidentiality,
while Group (1) - (6) ending forms are morphological expressions of
evidentiality. These auxiliaries are therefore often followed by
particles and other sentence-ending forms, either direct or indirect, to
allow those suffixed forms to bear the final sentence modality.
For this reason, only direct- and semi-direct-type endings of auxiliaries (e.g.,
hazu desu, hazu yone., and hazu deshoo.) are listed and investigated to
see the speakers' use of these subjective items; auxiliary forms with
indirect endings (e.g., hazu mitai) were included in the forms of
indirect endings in Group (7).

(Group 10) 

Group (10) consists of "I think" expressions, including -to omou, -to
kangaeru, -to rikaisuru, and others. As the existence of -to
(quotation) before the expressions suggests, most of these expressions
are usually used as matrix verbs in complex sentences. These forms are
treated as indirect sentence endings, although the expressions show the
speaker's subjective judgment in the same way as Group (9) evidentials. To see
how directly or indirectly the informants handle information through
these subjective indirect expressions, these items were separated.
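
As a compact overview of the ten groups, the forms named in this section can be arranged in a lookup table. This is a hedged sketch in Python for illustration only: ENDING_GROUPS and groups_for are hypothetical names, the short descriptions are paraphrases, the entry for Group (1) lists only the forms mentioned in this part of the chapter, and intonation is deliberately left out of the keys.

```python
# Group number -> (short description, base forms named in this section).
ENDING_GROUPS = {
    1: ("direct endings (forms named here only)",
        ["ga", "kara", "node", "wake", "wayo", "wakesa"]),
    2: ("rapport -ne and related forms", ["ne", "na"]),
    3: ("semi-direct forms, falling intonation", ["daroo", "deshoo", "janai"]),
    4: ("direct-question forms, rising intonation",
        ["ne", "daroo", "deshoo", "janai"]),
    5: ("sharing -ne", ["ne"]),
    6: ("question endings", ["ka", "no"]),
    7: ("indirect forms: inference", ["mitai", "yoo", "rashii"]),
    8: ("indirect forms: hearsay",
        ["datte", "soo", "to kiita", "to iwareta", "to iu hanashi"]),
    9: ("epistemic auxiliaries",
        ["kamoshirenai", "hazu", "ni chigainai", "daroo"]),
    10: ("'I think' expressions", ["to omou", "to kangaeru", "to rikaisuru"]),
}


def groups_for(base_form):
    """Return every group in which a base form is listed, in ascending order."""
    return sorted(
        group for group, (_, forms) in ENDING_GROUPS.items() if base_form in forms
    )


print(groups_for("daroo"))
print(groups_for("ne"))
```

A base form such as daroo or ne is listed under several groups, which reflects the point made above that the surface form alone is not enough: intonation (falling, rising, prolonged) and function (confirmation versus conjecture) are needed to assign an ending to a single group.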



The occurrence of the sentence-ending evidential forms of these ten
groups was analyzed in relation to two factors: types of speech
situation, and propositional content of the speech, including the
speakers' age and sex. In this research, I argue that the hearer is
important in two distinct aspects: the hearer's knowledge about the
speaker's proposition is crucial for the speaker's choice of evidentiality,
and the hearer's social relationship to the speaker is also crucial for the
speaker in order for him to use the evidentiality markings to show
appropriate politeness. The hearer's knowledge of the speaker's
proposition is considered as the distance of the proposition from the
hearer and the speaker. Do they both know the proposition very well?
Is it public information? Is it the speaker's personal matter that he can
commit himself to? Is the speaker talking about the hearer's matter?
And so forth. The speaker may employ evidentiality expressions of
different degrees of certainty in each situation, considering the hearer's
psychological distance from what he is presenting. Therefore, it is
necessary to classify propositional context for the purpose of analysis.

Proposition types 

At the first stage of the analysis, the occurrence of the forms was
analyzed in relation to the types of propositions, i.e., to what degree
the speaker commits himself to the proposition's truth value. My



grouping of the propositions of sentences is largely based on the concept of
information territory of the speaker and the hearer. I grouped all
propositions into six basic groups:

[4-20] Proposition types for direct and indirect evidential forms

Proposition for direct evidentials
(A) information that is in the speaker's information territory,
that the speaker assumes the hearer does not know
(B) information that is in the speaker's information territory,
that the speaker assumes the hearer knows
(C) information that is in the speaker's information territory,
that the speaker assumes also falls into the hearer's territory

Proposition for indirect evidentials
(D) information that is in the hearer's information territory, that
the speaker does not know
(E) information that is in the hearer's information territory, that
the speaker knows
(F) information out of both speaker's and hearer's territory
(G) public information

(G) type propositions were included in the category of (F) type 
information at the beginning of the research, but were later separated 
for experimental purposes. (A) to (F) are the basic six propositional 
types in this research. 
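
The classification in [4-20] can be read as a short decision procedure. The sketch below, in Python, is only an illustration of that reading and not the coding instrument actually used; the function names and the four parameters (in_speaker, in_hearer, other_party_knows, public) are hypothetical, and other_party_knows stands for the hearer's assumed knowledge in types (A)/(B) and for the speaker's knowledge in types (D)/(E).

```python
def proposition_type(in_speaker, in_hearer, other_party_knows=False, public=False):
    """Assign one of the proposition types (A)-(G) listed in [4-20]."""
    if public:
        return "G"  # public information, separated from (F) later in the research
    if in_speaker and in_hearer:
        return "C"  # speaker's territory that also falls into the hearer's
    if in_speaker:
        # (B) if the hearer is assumed to know the information, otherwise (A)
        return "B" if other_party_knows else "A"
    if in_hearer:
        # (E) if the speaker knows the information, otherwise (D)
        return "E" if other_party_knows else "D"
    return "F"  # outside both territories


def evidential_class(ptype):
    """(A)-(C) are listed under direct evidentials, (D)-(G) under indirect ones."""
    return "direct" if ptype in {"A", "B", "C"} else "indirect"


print(proposition_type(True, False))
print(evidential_class(proposition_type(False, True, other_party_knows=True)))
```

Checking the public flag first is a design choice of this sketch: it mirrors the later separation of (G) from (F), but the text does not state an order of precedence among the types.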
This stratification of proposition types is based on empirical and 
theoretical analysis of the data. In my 1993 study, I looked into discourse 



data and confirmed that Japanese informants had unconsciously
conformed with the rules of information territory and used different
sentence-ending forms as suggested by Kamio (1987, 1990). At that time,
as noted earlier in chapters two and three, Kamio's early model had only
four cases of interaction of information territories, as [4-21] shows:

[4-21] Kamio's original concept of four information territories for a
speaker

                       | Inside the hearer's         | Outside the hearer's
                       | territory                   | territory
-----------------------+-----------------------------+-----------------------------
Inside the speaker's   | TERRITORY A                 | TERRITORY B
territory              | (information belongs to     | (information belongs only
                       | both speaker's and hearer's | to the speaker's territory)
                       | territories)                |
                       | direct+ne form              | direct form
-----------------------+-----------------------------+-----------------------------
Outside the speaker's  | TERRITORY C                 | TERRITORY D
territory              | (information belongs only   | (information is out of both
                       | to the hearer's territory)  | speaker and hearer's
                       |                             | territories)
                       | indirect+ne form            | indirect form

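
The four cells of [4-21] amount to a two-by-two lookup from territory membership to an expected sentence-ending form. The following Python sketch restates the table for illustration; KAMIO_EARLY_MODEL and expected_form are hypothetical names, and the form labels are copied from the table rather than taken from Kamio's own notation.

```python
# (inside speaker's territory, inside hearer's territory) -> (territory, form)
KAMIO_EARLY_MODEL = {
    (True, True): ("A", "direct+ne"),
    (True, False): ("B", "direct"),
    (False, True): ("C", "indirect+ne"),
    (False, False): ("D", "indirect"),
}


def expected_form(in_speaker, in_hearer):
    """Return the territory label and the single form the early model assigns to it."""
    return KAMIO_EARLY_MODEL[(in_speaker, in_hearer)]


print(expected_form(True, True))
print(expected_form(False, False))
```

Because each cell maps to exactly one form, any observed ending other than the one returned here counts as a departure from the early model.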
In Kamio's earlier model, each territory was assigned a single
surface sentence-ending form, as shown in [4-21]. My 1993 and 1994
studies confirmed that Kamio's model basically
reflects reality, but there were findings which did not agree with this
theory. The major disagreements and additions were as follows:
[4-22] 

(1) For territory (A) information, not only the form "direct + ne" 


was used as expected by Kamio, but also deshoo (tag-question), and -janai 
(negative tag-question) and other related forms were used by the 
informants. 

(2) For territory (B) information, which Kamio claimed is the 
only case in which the simple direct form is possible, male informants 
used simple direct forms generally as expected while female informants 
used direct forms with sentence-final particles such as ne (information 
sharing), yo (informing), and n-desu (explaining). These are 
addressee-oriented particles; therefore, it was suggested that the female 
speakers may have greater consciousness of the presence of hearers. 
(3) For territory (C) information, for which Kamio assumed
indirect forms with ne are appropriate, questioning forms and
janaino. (negative tag question + questioning) were used by the
informants instead of "indirect + ne" forms. It was also noted that this 
janai was different from the ones for territory (A) information; the use 
of janai for territory (C) information was observed with rising 
intonation. 
(4) For territory (D) information, for which indirect forms with 
ne were expected, informants used simple indirect forms and question 
forms rather than the expected indirect plus ne forms. 
(5) Analysis of family discourse showed that more direct forms 
were used among family members regardless of information territories. 
(6) Data from formal interview discourse suggested that, in formal 
situations, speakers unanimously did not use simple direct forms at all in 
talking about information that belongs to their own territory; ne 
related forms were preferred. 


(7) Kamio assumed that English speakers have a different concept 
of information territory; he argued that in English there are only two 
information territories, the speaker's territory and others; that is to say, 
English speakers do not care about the information territory of the 
hearers. However, my data suggested that English speakers also have a 
concept of hearer's territory and shared status of information between 
the speaker and the hearer. For territory (A) information, native
English-speaking informants used indirect forms in more than 62% of
utterances, and for territory (A) and (B) information, for which Kamio
expected that only direct sentence forms would be used by English
speakers, some kinds of indirect forms were used in more than 70% of the
utterances analyzed. Therefore, English and Japanese speakers may
basically have a similar concept of information territory.
(8) However, English speakers treated "public information" as
everybody's information and used the direct mode. This was a significant
difference between the two cultures.
The results of these earlier studies suggest the possibility of
different concepts of information territories between males and
females, in-group members and out-group members. The studies also
showed that the relationship of the propositional content and
sentence-ending forms is not as simplistic as Kamio expected, suggesting
that more finely sectioned information territories may exist in the
Japanese speaker's mind.

In these pilot studies, I analyzed data based on Kamio's
categorization of information territory (four territories), and I now
think the method that I used could be misleading; in doing analysis, the
possibility of the existence of other territories, or other types of 
interactions between the speaker's and the hearer's knowledge could 
have been ignored. In fact, as introduced in chapter three, in his 1994
revised paper, Kamio proposed eight cases in which the speaker's and
the hearer's information territories are differently interrelated. He
added two more surface sentence-ending forms which represent a new
concept of speaker/hearer territory interaction with daroo forms. The
usage of daroo was actually found in my 1993 study, but I simply
concluded that these forms were an extension of direct forms, since the
study was centered on Kamio's framework and I did not clearly see the
implication of the use of daroo (tag question/negative question) by my
informants.
Based on this retrospective thought, for this dissertation, I desired not to 
limit my analysis within existing frameworks laid out by either Kamio 
or other evidentiality studies, but at the same time, it is hardly practical 
to analyze the relationship between the sentence forms and the 
evidential context of the speaker's proposition without some framework 
which provides a way to "sort out" propositions into different categories. 

Thus this time, I first went through one part of the data and
examined the relevance of Kamio's newer framework (1994) to
the data. I gained some results through this process, constructed
my original model, and examined more data, which resulted
in more modifications of the model. I repeated this process a few times,
and finally reached my final model. I believe that this method worked
better than an approach in which the framework of an evidentiality
system is first decided on and next the forms of evidentials are sorted 
from the data. Therefore, an attempt was made to examine the data 
without the restriction of existing theoretically hypothesized 
frameworks. In this sense, again, the method of analysis and the data 
analysis itself are interwoven at the first stage of this research. The 
more detailed process that has led to the above categories of propositions 
[4-20] is explained in the next chapter. 

An analytical problem 

The crucial analytical problem, however, is how to judge the
propositional type of an utterance correctly. This is not
an interpretation problem of the speaker's "meanings", but a problem of 
judging how much the speaker should assume the proposition is 
known/shared by his interlocutors. The speaker's proposition (or 
information) types are categorized upon the assumed status of 
informational content of the proposition in the speaker's, the hearer's, 
or both parties' information territory or knowledge. In order to 
precisely determine how a given proposition is identified among 
conversationalists, it is necessary to know the nature of the proposition 
and how much each participant is supposed to know about the 
proposition. This is not very difficult if one is in the discussion and able 
to observe the reaction of the hearer to an utterance and the subsequent 
reaction of the speaker to the hearer's reaction. However, since I
cannot represent every informant's memory, the judgement is sometimes
difficult. Oishi (1985), who investigated Japanese final particles based
on the theory of "linguistic particularity" (Pike, 1982; Becker, 1979),
argued that an analyst's memory is unreliable:

In understanding what was meant by a participant's utterance, an 
analyst relies on nothing but his own unique set of remembered 
prior texts without having direct access to the participant's set. 
In investigating how this utterance was interpreted by other 
participants in the conversation, the analyst again has to use his 
own set of remembered prior texts, which of course is different 
from that of the participants. As has been noted, one of the 
difficulties in the study of conversation lies in the fact that 
participants' assumptions are not immediately accessible to an 
analyst. These assumptions seem to be formed and stored in 
people's memory through their language uses in the past. We will 
see in our data that even between fairly new acquaintances, in 
the course of conversation, each participant's unique set of 
remembered prior texts is adjusted to the other's set, and common 
assumptions are formed through negotiations. In other words, it
is a shared language activity that eventually forms such an
assumption. In the relationship between an analyst and the
participants, however, these processes of forming common
assumptions are not logically available because an analyst
typically does not share the conversation with the participants,
and therefore lacks the shared memory of language uses with 
them. (1985: 19-20) 

Due to the memory barrier, Oishi said, correctly I think, that the
actuality of conversation (i.e., text) is "distant" to the analyst and even to
the participants. To minimize the effect of the memory barrier, an
"appropriation of text"12 was suggested by Oishi following Ricoeur
(1981) and Becker (1977); however, the suggestion is not practical for
this particular study. Since I desired to find general tendencies within
my informants' use of evidentiality expressions, I looked into fairly
large discourse data provided by about 60 informants (besides students),
about 20 of whom are from public discourse. Therefore, it was difficult
to go back to each informant to discuss the data, although review 
discussions were held with several of the informants concerning the 
type of proposition and the particular forms of sentence-ending. Some 
discussions were useful while others were not. However, since I was a 
participant, I shared the common assumptions formed in our temporal 
memories with other participants for many discourses. In this sense, I 
was less helpless than a simple observer-analyst. To make the analysis 
consistent, after analyzing a few discourse excerpts, I formulated some 
rules of analysis which I felt necessary in order to minimize my 
subjective interpretation of the speaker's proposition types. Although 
the possibility of subjective analysis is unavoidable, an effort was made 
to mitigate the influence. 

Rules of analysis 

Sometimes, it was difficult to properly categorize the nature of a 
speaker's proposition within the milieu of the seven different 
information types of [4-20]. For example, public information (i.e., type 
G) can often be information out of the speaker's territory (i.e., type F) as 
well as mutually known information if it is experienced in some way by 
both parties (i.e., type C). In order to make consistent analysis, I 
formulated the following rules: 

Rule (1): If the type of a given proposition is ambiguous, for
example, ambiguous between (B) and (C), the utterance will be ignored
in the analysis. 

Rule (2): Hedges, conventional greetings, and conventional set-
phrases which do not represent their literal meanings will be excluded 
from the analysis. 

Rule (3): Incomplete sentences which do not include sentence 
final modality, and sentences with deontic modality (i.e., modality 
concerning permission, prohibition, and obligation) will be excluded 
from the analysis.

Rule (4): Information which is out of both parties' information
territory, is well known to most of the community members
including the discourse participants, and is known to be known
will be categorized as "public information" (G), while information that
does not fall into either party's information territory and which is
known by some or all participants will be treated as (F) type
information.

Regarding Rule (4), the informants showed that they distinguish 
between these two types of public information: (G) and (F). Often a 
speaker tried to confirm his hearer's knowledge about the public 
knowledge that he is presenting in order to decide on the mode of the 
proposition. The next discourse sample is an example: 
(4-23)
F12 (1): a, igirisu, london ni sundeta toki ni
         Well England london LOC lived time TEMP
         kanojo ga koten o yattete,
         she NOM exhibition OBJ did(STAT)
         sono hanashi, shitteru desho?
         that story know AUX(CONF)

F5 (2):  shiranai
         don't know

F12 (3): aa, honto? Ja, koten o yatteta -n datte.
         Well, really Then exhibition did(STAT)-n hearsay

F12 (1): Well, in England, when they lived in London, she was
         holding an exhibition of her own, you know the story,
         don't you?

F5 (2):  No, I don't know.

F12 (3): Well, then, it is said that she was holding an exhibition.

In the above conversation, in which speaker F12 was talking
about Yoko Ono, a famous public figure, she assumed that the hearer
knew the famous episode of the first meeting of Yoko Ono and John
Lennon, so she presented the proposition in the direct mode in (1),
suggesting that she was treating the proposition as public truth (i.e., a
G-type proposition). But after F5's reply in (2), F12 realized that F5
did not know the proposition, and she switched to the hearsay mode
(i.e., F-type) in (3). However, not every speaker is this sensitive to the
hearer's knowledge about public issues. A less sensitive speaker may use
only the direct mode to present public information, which may give the
hearer the impression that the speaker is treating a proposition that is
outside his territory as if it were in his territory. The next discourse is
an example of this type of interaction:
(4-24)
F2 (1): kawaisooda yo ne.
        miserable PART(VOC) PART(RAPP)
   (2): sorede kodomo ga futari mo dekichatte.
        then children NOM two as many as born(regret)
   (3): sorede rikon o shinai yooni san-nin me,
        then divorce ACC do(NEG) in such a way third one
        tsukuritai-tte itta kedo Chaaruzu, moo iranai.
        have(DES)-COMP said but Charles any more desire(NEG)
   (4): sokode moo hitori kodomo o tsukutte-oke-ba
        then more one child ACC have(te)-(RES)-(COND)
        warui kekka ni naranai-n-janaika-tte
        bad result DAT become(NEG)-n-(NEG)-Q-(COMP)
        iunde itta-n-dakedo, Chaaruzu ga kobanda no yo.
        so said-n-but Charles NOM rejected PART(VOC) (VOC)

Others (5): sugoooi, yoku shitteru nee#
            great well know PART(SHARE)

F2 (1): [Diana is] so miserable, isn't she?
   (2): Then they had two children.
   (3): So in order to prevent divorce, [Diana] said [to Charles] she
        wanted a third one, but Charles [said] he did not want
        any more.
   (4): [Diana] said so because [she thought] they could avoid a bad
        ending if they had a third child, but Charles rejected the
        idea (I tell you).

Others (5): Wow... you know very well, don't you?

In this conversation, speaker F2 was talking about the collapse of
Princess Diana and Prince Charles's relationship, and since she used the
direct mode (as underlined), the hearers (four of them) unanimously
reacted by pretending to be impressed by speaker F2's knowledge.
However, as the proposition is someone else's very private matter, which
can hardly be in speaker F2's information territory, the others' reaction
can be understood as critical. A proposition of this type is usually
treated as an (F) type proposition and spoken in the hearsay mode.

Rule (5): If a given proposition that is public happens to fall in
the speaker's, the hearer's, or both parties' information territory,
personal territory will be considered to have the primary status.

Rule (6): Common sense knowledge which almost everybody 
agrees to will be considered to be known by "experience" so it falls into 
proposition type (C). 

Following the rules above, all applicable propositions were sorted
into (A), (B), (C), (D), (E), and (F) as proposed, and into two additional
types, (G) (public information) and (H) (self-talk), added for
experimental purposes (see the next chapter). Within each propositional
category, the occurrence of sentence-ending evidential forms was
monitored.
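
As a rough illustration only, the screening and sorting rules above can be written as a small decision procedure. The Python sketch below is not the procedure actually used in this study, in which every judgement was made by hand; the `Utterance` fields, the function names, and the assumption that (F) and (G) are the only types lying outside both parties' territories are all hypothetical and introduced here for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Assumption: (F) and (G) are the types outside both parties' territories;
# every other label is treated here as a "personal territory" type.
OUTSIDE_TYPES = frozenset({"F", "G"})


@dataclass(frozen=True)
class Utterance:
    """Hypothetical analyst annotations for one utterance."""
    candidate_types: FrozenSet[str]   # proposition types the analyst considers possible
    is_formulaic: bool = False        # hedge, greeting, or set phrase (Rule 2)
    has_final_modality: bool = True   # False for incomplete sentences (Rule 3)
    is_deontic: bool = False          # permission, prohibition, obligation (Rule 3)
    is_common_sense: bool = False     # knowledge almost everybody agrees to (Rule 6)


def outside_type(widely_known: bool, known_to_be_known: bool) -> str:
    """Rule 4: choose between public (G) and out-of-territory (F) information."""
    return "G" if widely_known and known_to_be_known else "F"


def classify(u: Utterance) -> Optional[str]:
    """Return a single proposition type, or None if the utterance is excluded."""
    if u.is_formulaic:                              # Rule 2
        return None
    if not u.has_final_modality or u.is_deontic:    # Rule 3
        return None
    if u.is_common_sense:                           # Rule 6
        return "C"
    candidates = set(u.candidate_types)
    personal = candidates - OUTSIDE_TYPES
    if "G" in candidates and personal:              # Rule 5: personal territory is primary
        candidates = personal
    if len(candidates) != 1:                        # Rule 1: ambiguous, so ignored
        return None
    return next(iter(candidates))
```

Under these assumptions, an utterance annotated as possibly (B) or (C) is dropped, while one annotated as possibly (G) or (C) is counted as (C), because the personal territory takes priority over the public status of the information.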

The process for creating the database for quantitative analysis is 
illustrated in the following chart, [4-25]. 


[4-25] Database for quantitative analysis
(1) Data collection (recording)
(2) Transcription
(3) Data input
    (3-1) Informant data (SITUATIONAL CONTEXT)
          Code name (e.g., F1, M2) --> DATABASE
          Age --> DATABASE
          Gender --> DATABASE
    (3-2) For each sentence-ending form with clear epistemic modality:
          (a) Informant's code --> DATABASE
          (b) Evidential form information:
              form of sentence ending --> DATABASE
              plain/polite distinction --> DATABASE
              group type of the form --> DATABASE
                  Group (1) - Group (10)
          (c) Discourse type (SITUATIONAL CONTEXT) --> DATABASE
              1) discussion with high formality
              2) court interaction (prosecutor/defendant)
              3) public talk
              4) conversation with low formality with friends
              5) conversation with low formality with family members
              6) teacher-student interaction at school (teacher/student)
          (d) Proposition type (PROPOSITIONAL CONTEXT) --> DATABASE
              (A) ~ (H)
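
To make the structure of the database in [4-25] concrete, the following Python sketch shows one possible way of representing an informant, a single sentence-ending record, and a count of forms within one proposition type. It is a minimal sketch under stated assumptions: the class names, field names, and the three sample records are invented for illustration and do not come from the actual data or from the software used in this study.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable

# Discourse types (c) 1)-6) and proposition types (d) (A)-(H) from chart [4-25].
DISCOURSE_TYPES = {
    1: "discussion with high formality",
    2: "court interaction",
    3: "public talk",
    4: "conversation with low formality with friends",
    5: "conversation with low formality with family members",
    6: "teacher-student interaction at school",
}
PROPOSITION_TYPES = frozenset("ABCDEFGH")


@dataclass(frozen=True)
class Informant:
    """Item (3-1): informant data."""
    code: str      # e.g. "F1", "M2"
    age: int
    gender: str


@dataclass(frozen=True)
class FormRecord:
    """Item (3-2): one sentence-ending form with clear epistemic modality."""
    informant_code: str
    ending_form: str
    is_polite: bool          # plain/polite distinction
    form_group: int          # Group (1) - Group (10)
    discourse_type: int      # key of DISCOURSE_TYPES
    proposition_type: str    # "A" - "H"

    def __post_init__(self) -> None:
        if not 1 <= self.form_group <= 10:
            raise ValueError("form_group must be between 1 and 10")
        if self.discourse_type not in DISCOURSE_TYPES:
            raise ValueError("unknown discourse type")
        if self.proposition_type not in PROPOSITION_TYPES:
            raise ValueError("unknown proposition type")


def forms_by_proposition(records: Iterable[FormRecord], proposition_type: str) -> Counter:
    """Count sentence-ending forms observed within one proposition type."""
    return Counter(r.ending_form for r in records
                   if r.proposition_type == proposition_type)


# Invented sample rows (group numbers and codings are placeholders, not real data).
records = [
    FormRecord("F1", "ne", False, 1, 4, "C"),
    FormRecord("M2", "ne", True, 1, 1, "C"),
    FormRecord("F1", "datte", False, 2, 4, "F"),
]
```

With these invented rows, `forms_by_proposition(records, "C")` counts two occurrences of ne, which corresponds to monitoring the occurrence of sentence-ending forms within one propositional category.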


CHAPTER 4: NOTES 

1 Although I did not ask for information regarding social class, I
assume the informants would claim that they are middle-class
city-dwellers since most Japanese people claim to be middle-class. All of
the informants happened to be office workers (presently or retired) or
housewives. But this may not be applicable to all of the student
informants in the schools I visited.

2 A brief account of each case is given below, which may help 
readers understand the transcribed speeches used in this dissertation. 

Yakugai-AIDS case (case of medical products tainted with AIDS 
virus): In 1996, it was revealed that twelve years earlier, the Japanese 
Ministry of Health (MOH) had delayed the termination of the use of 
possibly AIDS-tainted blood products (ketsueki-seizai) imported from 
the U.S.A. for hemophiliac patients. This happened before the Japanese 
people became familiar with the disease. Teikyo University found that 
more than twenty of their hemophiliac patients were HIV positive yet 
the university continued to use the blood products with the excuse that 
they were not sure if the patients were really infected by AIDS virus. 
MOH and its affiliated AIDS research committee led by a Teikyo 
University doctor were suspected of trying to delay the recognition of 
the first AIDS patient in Japan. It was suspected that this delay was due 
to the relationship between the ministry (MOH) and the manufacturer 
of the blood product, Midori-juji (Green Cross), a pharmaceutical 
company, run by officials retired from MOH. For more than ten years,
the existence of hundreds of AIDS patients who had become infected by
this blood product was not well known to the public. Finally in 1996,
one young man who is a victim of the case requested public attention,
and the newly assigned minister of MOH, who carried out the
investigation, disclosed details of the misconduct to the public. This case
revealed two problems with Japanese society: problematic cohesion of
government and industry which works contrary to the benefit of the
public, and the secretive nature of Japanese governmental activities.

Aum-shinrikyo case (case of Aum cult): 

A cult, led by Asahara Shokoo, who claimed to be "God", attempted to 
seize Japanese Governmental functions. Interestingly, Asahara had a lot 
of intellectual and successful followers who supported him financially 
and technically. They invented weapons (conventional and biological) 
and other materials to occupy the country physically and killed those 
who tried to escape from the cult or who were about to find out what the 
cult was attempting. They surfaced for the first time when seizure of 
the governmental body at the Kasumigaseki area was attempted by 
strewing Sarin poison gas in the area. Several core members were 
involved, and even after Asahara himself was finally arrested, some of 
them were still at large. Since further attempts to seize governmental
control physically were feared, the police carried out one of the
most extensive searches the country had ever seen.

3 For example, the following chart demonstrates the relationship 
of the group membership of the parties involved with the selection of 
the verb "to be": 

[4-26] Different "to be" verbs depending on listener and referent

listener      referent      verb used by the speaker
in-group      in-group      iru
out-group     in-group      oru
in-group      out-group     irassharu
out-group     out-group     irassharu

4 The system of Japanese honorifics has been considered to have
two axes: the speaker-addressee axis ("performative" honorifics) and the
speaker-referent axis ("propositional" honorifics) (e.g., Harada, 1976;
Shibatani, 1990).

"Addressee-oriented" honorifics are said to be widespread
throughout the world. The use of French vous and German Sie is an
example (Shibatani, 1990:375). Addressee-oriented honorifics do not
require the presence of a person "socially superior to the speaker" in the
propositional content of the sentence (Harada, 1976:502). Japanese polite
sentence-ending forms (i.e., desu/masu) fall into the category of
performative honorifics. For example, the following three sentences
in (4-27) have the same referential meaning, "this is a book", but (a) is
used with familiar or equal-status addressees in casual speech settings,
while (b) is used with someone who is socially distant or higher. (b) is
also used among equals or to lower-status addressees in formal settings
with bystanders. (c) is used with an addressee who is significantly
superior to the speaker, or with anybody in a very formal environment.

(4-27)

(a) Kore wa hon da.
    this TOP book COP
(b) Kore wa hon desu.
    this TOP book COP(FOR)
(c) Kore wa hon degozaimasu.
    this TOP book COP(hyperpolite)

"Referent" honorifics (or propositional honorifics) include the
target of honorific use in the subject position of the sentence ("subject
honorifics") or the object position of the sentence ("object honorifics").

Each "performative (addressee)" and "propositional (referent)"
honorific usage has three different levels of formality: "plain", "polite",
and "hyper-polite", as shown in the above sentences (a), (b), and (c).
The axis of performative honorifics and the axis of propositional
honorifics are independent of each other except when the subject or
the object of a sentence coincides with the addressee or the speaker.
Therefore, theoretically six different formality levels are possible. The
following sentences (d) to (f') have the same referential meaning, "the
teacher laughed". Among them, (d) is a plain sentence without either
propositional or performative honorifics. Sentences (e), (e'), (f), and
(f') are examples of propositional (i.e., referent) honorifics in that the
target of the honorific is sensee (teacher). The combination of the
nominalized verbal form warai ni (to laugh) with the honorific prefix
o- and the adverbial complement of the verb naru (become) indicates a
form of referent honorific. In terms of referent honorifics, sentences (d)
and (d') are at the plain level, (e) and (e') are at the polite level, and (f)
and (f') are at the hyper-polite level. In terms of performative
(addressee) honorifics, (d), (e), and (f) are in plain form while (d'), (e'),
and (f') are in polite form:

(4-28)

(d)  sensee ga warat-ta.
     teacher NOM laugh-(PAST)
     --- plain

(d') sensee ga warai mashita.
     teacher NOM laugh-(FOR)(PAST)
     --- polite (addressee honorifics)
         plain (referent honorifics)

(e)  sensee ga o-warai ni nat-ta.
     teacher NOM HON-laugh HON-(PAST)
     --- plain (addressee honorifics)
         polite (referent honorifics)

(e') sensee ga o-warai ni nari-mashita.
     teacher NOM HON-laugh HON-(FOR)(PAST)
     --- polite (addressee honorifics)
         polite (referent honorifics)

(f)  sensee ga o-warai ni narare-ta.
     teacher NOM HON-laugh HONhyperpolite-(PAST)
     --- plain (addressee honorifics)
         hyper-polite (referent honorifics)

(f') sensee ga o-warai ni narare-mashita.
     teacher NOM HON-laugh HONhyperpolite-(FOR)(PAST)
     --- polite (addressee honorifics)
         hyper-polite (referent honorifics)

[(d) and (e) are from Shibatani, 1990: 376]
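
The way the six sentences (d) to (f') arise from two independent axes can be sketched as a simple cross product. The short Python sketch below is only an illustration: it assumes, following the six examples, two addressee levels (plain, polite) crossed with three referent levels (plain, polite, hyper-polite), and the constant and function names are hypothetical.

```python
from itertools import product

# Levels as they appear in examples (d)-(f'); this pairing is an assumption
# read off the six example sentences, not an exhaustive account of the system.
ADDRESSEE_LEVELS = ("plain", "polite")
REFERENT_LEVELS = ("plain", "polite", "hyper-polite")

# (addressee level, referent level) -> label of the example sentence
EXAMPLE_LABELS = {
    ("plain", "plain"): "(d)",
    ("polite", "plain"): "(d')",
    ("plain", "polite"): "(e)",
    ("polite", "polite"): "(e')",
    ("plain", "hyper-polite"): "(f)",
    ("polite", "hyper-polite"): "(f')",
}


def formality_combinations():
    """All combinations of the two independent honorific axes."""
    return list(product(ADDRESSEE_LEVELS, REFERENT_LEVELS))


def example_for(addressee: str, referent: str) -> str:
    """Label of the example sentence showing the given combination."""
    return EXAMPLE_LABELS[(addressee, referent)]
```

Because the two axes vary independently, each of the six combinations corresponds to exactly one of the example sentences; for instance, polite addressee honorifics with hyper-polite referent honorifics is sentence (f').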

Performative honorifics are shown in addressee-oriented
sentence-ending forms, so they are directly related to the issue of this
dissertation. In the use of performative honorifics, the plain form level
(da, -ta, etc.) is perfectly acceptable for communication among people 
who share a close relationship such as family, friends, colleagues of 
similar age, without any implied disrespect. The plain form may also be 
used by a speaker in a superior position in informal situations to 
inferior-status addressees with no connotation of rudeness. The form is 
not suitable for any kind of formal setting such as meetings or speeches. 

Polite forms of performative honorifics (desu, -masu, etc.) are 
used among strangers and distant acquaintances indicating social 
distance, and are also used by lower-status speakers to higher-status 
hearers in the same group (family, company, school, etc.) showing 
casual respect from status differences. Polite forms are as commonly 
used as the plain forms. 

The use of hyper-polite honorifics is limited to formal speech settings.
This form of honorifics uses a different lexicon (e.g., to eat is
meshiagaru in hyper-polite form vs. taberu in plain form), or is
indicated by an honorific suffix or prefix. There are usually three
different types of hyper-polite meanings: humble, exalted, and neutral.

5 Considering the long history of Japan, the Japanese language
has been standardized only fairly recently. Standardization started in
1869 at the time of the Meiji Restoration. Japanese people were historically
"confined" to their birth prefecture that was governed by a Daimyo (lit. 
big samurai), without the freedom to leave that prefecture. This policy 
was maintained for a long time in order to keep farmers "tied" to the 
land to secure the tax income of each Daimyo. Therefore, there was no 
communication among the sixty odd local prefectures. This restriction 
enhanced the development of local dialects. It is reported that during
the middle of the Edo era (i.e., the seventeenth century) people were
unable to communicate outside of their own prefecture. In addition to local
dialects, "class" dialects developed; people in different social classes (e.g. 
monks, soldiers, general public, women) spoke different "languages". 
Further, each class used different written and spoken languages. 
Overall, before language standardization, there were diverse versions of 
the Japanese language. Then, after the political unification of all 
prefectures was achieved to establish the nation of Japan as a whole, it 
was realized that language standardization was urgently needed for 
"communication convenience" and also for "national unity". This 
necessity was heightened by the contingency of wars. Language 
planning started with the collection of data from local dialects to select 
one standard dialect. The national committee in charge decided to select 
the Tokyo dialect, and prescribed grammar details including 
phonological expressions. Written and spoken languages were unified 
in the standard language. Implementation of the standard Japanese was 
successfully performed through school education. Rapid development 
of mass-communication such as TV and radio also helped the 
implementation to a great extent. Mass-communication has also 
contributed to shaping the standard language into its current form (e.g.,
Kamei, et al., 1965a, b; Matsumura, 1986; Mashita, 1953; Sanada, 1983;
Sato, 1982).

6 Sanada's quantitative research (1983) in every prefecture in
Japan on the range of standardized forms of selected words indicated
that Tokyo dwellers scored 61.1% on average. Although this is not as high
as a non-dialectologist may expect, the score was the highest among
forty-eight prefectures. The Kanto-area prefectures (surrounding
Tokyo) were all ranked high: Saitama, 60.8%; Tochigi, 60.7%; Kanagawa,
59.4%; Gunma, 57.7%. Although Hokkaido, the northernmost island, is
ranked next (53.8%), generally, the farther away from Tokyo a
prefecture is located, the lower its score was. The southern islands of
Okinawa (3.3%) and the prefectures on Kyushu island (25-31%) scored
low, as did the northern Honshu prefectures (21-27%).

7 Dialectal differences entail a variety of linguistic features;
therefore, it is difficult to articulate how many regional dialects are
spoken in Japan. Dialect maps are drawn to show regional differences
in each single feature: phonemes, accent, tone, lexicon, semantic
categories, a number of grammatical aspects (e.g., conjugated forms of
verbs and adjectives, nominal-adjectives, noun-compounds, particles,
honorifics), and others. It has been generally understood that dialectal
divisions based on different linguistic features have different dialect
boundaries. However, Kindaichi (1977) described general divisions
among dialects that are supported by phonological, grammatical, and
accentual differences among dialects. In Kindaichi's general dialectal
map, there are three principal dialect groups: Nairin-dialect,
Churin-dialect, and Gairin-dialect, and the groups are further divided
into twenty-five sub-divisions in total.
[A] Nairin-dialect (5 sub dialects)
    1. Standard Ko-type dialect
    2. Tosa dialect
    3. Western Kagawa prefecture dialect
    4. Eastern Kagawa prefecture dialect
    5. Southern Noto dialect
[B] Churin-dialect (10 sub dialects)
    (a) Standard Otsu-type dialect
        1. Eastern Japan Churin dialect (Tokyo, Kanagawa, etc.)
        2. Western Japan Churin dialect
           (i) Noobi dialect
           (ii) Totsugawa dialect
           (iii) Chugoku dialect
           (iv) Shikoku Inan area dialect
           (v) Northeast Kyushu dialect
    (b) Quasi-Ko-type dialect
        1. Hokuriku dialect
        2. Sekiho, Nagahama dialect
        3. Kumanonada dialect
        4. Shikoku Uwa area dialect
[C] Gairin-dialect (10 sub dialects)
    (a) Eastern Japan Gairin dialect
        1. North Oh-u, Hokkaido dialect
        2. South Oh-u, Northern Kanto dialect
    (b) Hachijoo-jima dialect
    (c) Ooigawa, Yamanashi-Narada dialect
    (d) Northwest Noto dialect
    (e) Izumo, Oki dialect
    (f) Kyushu dialect
        1. Chikuzen, Iki, Tsushima dialect
        2. Miyazaki dialect
        3. Northwest Kyushu dialect
        4. Satsuma, Goshima dialect

8 Sentence-ending forms are described in standard Japanese. The
data contain a limited number of dialectal forms (most of them are from
the Kansai area); they are "assimilated" in description into the standard
forms in quantitative analysis. The following are dialectal forms
included in the list:

(local dialect)        (standard forms)        (meanings of expression)
  (example)              (example)               (example)

V-ta form n-ka na      V-ta form no ka na      I wonder (self-talk)
  (e.g. atta-n-ka na)    (e.g. atta no ka na)    (I wonder there was..)
-n kedo                -nai kedo               direct negative
  (e.g. shira-n kedo)    (e.g. shira-nai kedo)   (I do not know)
-toru                  V-te form + iru         stative
  (e.g. ittoru)          (e.g. itteiru)          (they are saying)
-totta                 V-te form + ita         somebody said...
  (e.g. ittotta)         (e.g. itteta)           (Somebody said so)
ya                     da, yo                  direct vocative
  (e.g. soo ya kedo)     (e.g. soo da kedo)      (It is so, I am telling you)
  (e.g. kita-n ya)       (e.g. kita no yo)       (Someone came, I am telling you.)
-yate                  -datte                  hearsay
  (e.g. akan yate)       (e.g. dame datte)       (Someone said "No")
-to chigau?            -janai?                 tag-Q, negative question
-yaro.                 -deshoo.                tag-Q
-henya-n-ka            -hen janai?             Isn't it strange?
-nen                   -n da                   explanation
  (e.g. kireru nen)      (e.g. kireru n da)      (This cuts, you understand)
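
The "assimilation" of dialectal forms into standard forms before counting can be pictured as a simple lookup. The Python sketch below is only an illustration of that step: it uses a few of the example pairs listed above, and the dictionary, the function name, and the choice of a whole-expression lookup are assumptions made here, not a description of how the data were actually processed.

```python
# A few dialect/standard example pairs taken from the list above.
DIALECT_TO_STANDARD = {
    "atta-n-ka na": "atta no ka na",
    "shira-n kedo": "shira-nai kedo",
    "ittoru": "itteiru",
    "soo ya kedo": "soo da kedo",
    "kita-n ya": "kita no yo",
    "kireru nen": "kireru n da",
    "akan yate": "dame datte",
}


def assimilate(form: str) -> str:
    """Return the standard form for a known dialectal form, else the form itself."""
    cleaned = form.strip()
    return DIALECT_TO_STANDARD.get(cleaned, cleaned)
```

For example, `assimilate("akan yate")` gives the standard hearsay expression dame datte, while a form that is already standard, such as soo da kedo, is returned unchanged.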
9 Oishi characterized ne with rising tone as indicating that
information belongs to the speaker's territory. I suspect, however, that
this rising -ne in the data described by Oishi is the rising version of
"rapport ne" in my analysis, which simply sends an "I am talking, are
you listening?" message to the hearer. Oishi found this ne (in his data)
in the speech of a single speaker; therefore, the high pitch of rapport -ne
may be this individual's personal trait (actually there are some people
who habitually do this). In my model, "rising -ne" (as well as rising
-yone) involves both parties' knowledge.

10 There are traditional ways of raising declarative sentence
endings to signal questions. Actually this usage is very common,
especially in casual speech. However, these "traditional" rising endings
in declarative sentences and "quasi-questions" in declarative sentences
are different in tone.

In quasi-questions, often the very last vowel of the sentence-final
(or word-final) syllable (cf. the Japanese unit of sound is the syllable) is
prolonged and sharply raised. If a speaker asks a question by raising
the end of a declarative sentence (e.g., You are a UT student?), the
sentence ending is raised naturally and gradually in the sentence-final
word. The quasi-question forms are used as a surface presentation of the
speaker's willingness to solicit agreement from his hearer, so the form
may result in superficial raising of the final vowel of the final syllable.

11 But at least to me, the quasi-question strategy did not sound
modest; it was rather annoying in that I felt as if I was bombarded with
tons of requests for agreement which I was actually not asked to
answer. Often, quasi-question forms are used for type (A) propositions,
i.e., information which is exclusively known to the speaker and which
does not need to be agreed to or confirmed by the hearer.

12 Oishi (1985:33) quoted Ricoeur in order to explain the concept of
"appropriation":

If it is true that interpretation concerns essentially the power of 
the work to disclose a world, then the relation of the reader to the 
text is essentially his relation to the kind of world which the text 
presents. The theory of appropriation which will now be 
sketched follows from the displacement undergone by the whole 
problematic of interpretation: it will be less an intersubjective 
relation of mutual understanding than a relation of apprehension 
applied to the world conveyed by the work. A new theory of 
subjectivity follows from this relation. 

To understand is not to project oneself into the text; it is to receive 
an enlarged self from the apprehension of proposed worlds 
which are the genuine object of interpretation. Following 
Gadamer's analysis in Truth and Method, we shall introduce the 
theme of "play". This theme will serve to characterize the
metamorphosis which, in the work of art, is undergone not only
by reality but also by the author (writer, artist), and above all
(since this is the point of our analysis) by the reader or the
subject of appropriation.

(Ricoeur, 1981: 185)

His explanation is rather abstract, but in short, it seems Ricoeur
meant that through "play", the analyst realizes an "enlarged self" and
"the actualization of meaning as addressed to someone" (1981: 185), and
that in this process the reader (analyst) forgets himself and the things
he previously took to be natural in language. What, then, should be
done in practice to appropriate the text? Oishi himself drew up a
three-step framework for his text data: a first step of appropriation
followed by a description of the text by the analyst; a second step of
appropriation followed by a description of the text by the analyst and
the participants; and a third step of appropriation by the analyst with
the view integrated through the first and second steps. It was
emphasized that the analyst's interview of the informants provides an
important appropriation of the text for approaching the actuality of a
conversation.




CHAPTER 5: MODEL OF JAPANESE EVIDENTIALITY 
In this chapter, I will propose my model of the framework for
Japanese evidentiality based on empirical data as well as the theories of
the universal concept of evidentiality and the Japanese concept of
information territory.

THE CONCEPT OF INFORMATION TERRITORY AS BACKGROUND FOR THE 
MODEL 
Direct versus indirect evidentiality 

The Japanese evidentiality system model which I propose consists 
of two basic types of evidentials that are considered universal: "direct 
evidence" and "indirect evidence" as in Willett's model (cf. chapter two 
and appendix C). The principal difference between the universal 
concept of direct evidence and my model is that direct evidence in my 
model is not limited to that which the speaker has obtained through 
direct experience; it includes any information to which he has socially 
authorized primary access, i.e., information (or propositions) which 
belongs to the speaker's "information territory" (in Kamio's term). 
Information other than this is considered to be based on indirect 
evidence and expressed in structurally indirect forms such as hearsay 
evidentials and questions. This is the first corollary of the model: 

COROLLARY 1 (direct/indirect evidentials):
Direct evidentials express a speaker's proposition which falls in
the speaker's information territory and to which the speaker has
socially licensed primary access in each speech situation.
Indirect evidentials express a proposition which does not fall in
the speaker's information territory.

The speaker's and the hearer's information territory 

As assumed, the Japanese concept of evidentiality is very deeply
related to the knowledge of the speaker and the hearer, just as in the
Kogi language (Hansarling, 1984, cited in Palmer 1986; see chapter two).
Furthermore, Japanese evidentiality is specifically related to the
concept of information ownership, and is not a simple matter of
"knowing" or "not-knowing". Therefore, as the initial task of this
research, it was necessary to arrive at the most realistic model of
the speaker's psychological information territory.

In the process of reaching the final model of evidentiality
through data analysis, I found the fundamental concepts in Kamio's
model to be very useful. However, from the viewpoint of evidentiality,
Kamio's theory does not fully reflect the reality of informants' use of
evidentials; consequently, a new framework was necessary.

In the model which I am proposing, a speaker's "knowledge" and
the "information in his own territory" are treated as distinctly
different. In this sense, the condition for being classified as
information belonging to the speaker's territory is the most essential
corollary in the model. As explained in chapter three, Kamio provided
three conditions for the speaker's territory information¹, which I
modified as follows on the basis of the results of data analysis:

COROLLARY 2 (the speaker's information territory):

A speaker's information territory contains the following four
major types of information:

(a) Information obtained through the speaker's past and current
    direct experience through visual, auditory, or other senses,
    including the speaker's inner feelings;
(b) Information about people, facts, and things close to the
    speaker, including information about plans, actions, and
    behavior of the speaker or other people whom the speaker
    considers to be close, and information of places with which
    the speaker has a geographical relation;
(c) Information embodying detailed knowledge which falls
    within the speaker's area of expertise (professional or
    otherwise);
(d) Information which is unchallengeable by the hearer due to
    its historically and socially qualified status as truth.

The above corollary suggests that even if a speaker has some
knowledge about his proposition, if the proposition does not meet at
least one of these four qualifications, it does not belong to his
territory; it is knowledge outside his territory.
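
The distinction between "knowing" a proposition and "owning" it can be
made concrete with a small sketch. The following is only an
illustration under stated assumptions: the record type Proposition, its
field names, and the function in_speaker_territory are hypothetical
names introduced here, not part of the model's own notation. Each
boolean field stands for one of conditions (a) to (d) of Corollary two.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposition:
    """Hypothetical record of how a speaker relates to one proposition."""
    speaker_knows: bool = False          # the speaker has some knowledge of it
    direct_experience: bool = False      # condition (a)
    close_to_speaker: bool = False       # condition (b)
    within_expertise: bool = False       # condition (c)
    unchallengeable_truth: bool = False  # condition (d)


def in_speaker_territory(p: Proposition) -> bool:
    """True if at least one of conditions (a)-(d) holds.

    Note that speaker_knows is deliberately ignored: knowledge alone
    does not place a proposition in the speaker's territory.
    """
    return any((p.direct_experience, p.close_to_speaker,
                p.within_expertise, p.unchallengeable_truth))


# Invented illustrations: hearsay that is known but not owned,
# and a speaker's own experience.
hearsay = Proposition(speaker_knows=True)
own_trip = Proposition(speaker_knows=True, direct_experience=True)
print(in_speaker_territory(hearsay))   # False
print(in_speaker_territory(own_trip))  # True
```

In this sketch the status is strictly IN or OUT; no degree of
membership is represented.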

These days an individual is destined to be exposed to a huge amount
of information from various sources. In fact, one's daily life is often
based on dealing with information, i.e., getting, producing,
transferring, evaluating, and manipulating information. Among the
assorted information sources, the most reliable one is, naturally, a
speaker's direct experience. However, the information from direct
experience [i.e., condition (a) in Corollary two] is only a small
fraction of all the information which a speaker linguistically
expresses in direct forms. The target information for direct
evidentials also includes certain types of information besides direct
experience, as specified by (b), (c), and (d) of Corollary two. This
kind of information, theoretically and also empirically speaking,
motivates a speaker to be linguistically direct. Examples of the
information defined as the speaker's information by Corollary two are
shown below:

(a) Information obtained through the speaker's past and
    current direct experience through visual, auditory, or
    other senses, including the speaker's inner feelings;

(5-1)
F26: amerika made dono kurai jikan kakatta ka oboeteru.
     USA till how long time took COMP remember(STAT)?

S2:  wasureta. neta yo.
     forgot slept VOC

F26: Do you remember how long it took to go to America?

S2:  I forgot. I slept.

The information, "I forgot" and "I slept", is based on the speaker's
direct experience and most genuinely belongs to the speaker's
information territory. Both sentences by speaker S2 are direct
sentences with direct endings in Japanese. This kind of proposition
is sufficiently straightforward as not to require further examples.

(b) Information about people, facts, and things close to the
    speaker, including information about plans, actions, and
    behavior of the speaker or other people whom the
    speaker considers to be close, and information of places
    with which the speaker has a geographical relation;

The following two statements are from fathers referring to their sons.
Both fathers treat information about their sons as their own, as they
consider their sons and matters related to them to be close to
themselves:
(5-2)
M12: ano ima borantia undoo o iroiro yatteru
     well now volunteer activities OBJ various doing(STAT)

     mon desu kara ne
     COMP COP(FOR) ABL PART(RAPP)

M12: [My son] is now doing all sorts of volunteer work.


(5-3)
M1: kare, jibun no shumi de atsumeteru hon ga ne,
    he himself POSS hobby collecting(STAT) books NOM RAPP

    eikoku ni ooi kara ne. militarii bukku.
    England LOC many because PART(RAPP) military books

    kore wa nee, mukoo iku-to monosugoi ookina
    this TOP RAPP overthere go-COND tremendously large

    boodaina korekushon ga aru-n-da.
    huge collection NOM exist-n-COP

M1: He [= my son], the books he collects as a hobby are abundant in
    England. Military books. This is, when you go to England,
    they have a huge collection of this kind of book.

In the following statement, M1 and F18 talked about the current
anti-British trend in Australia. Although the speakers are Japanese,
they lived in Australia for a long time, and even after returning to
Japan they visit Australia every year. All those present
knew of their close relationship with Australia. Therefore, the speakers
are considered to be entitled to speak about the country as information
close to them. This is an example of a direct evidential based on a
close "geographical relationship".
(5-4)
M1 (1): de ne., ano daiana nanka no ikken ne.
        then RAPP that Diana et al. MODI incident RAPP

   (2): eikoku no ooshitsu ni taisuru ishin ga
        England POSS crown DAT toward dignity NOM

        masu masu sagatte-kita.
        more and more decrease(te form)-came.

   (3): kanari oosutoraria no hoshutoo ano
        very much Australia POSS conservative party that

        hoshutekina ootouha ga ne., konogoro osaregimi.
        conservative Tories NOM RAPP these days drop-off

F18 (4): eikokukei ga honto sukunaku natta.
         British people NOM really became few

M1 (1): Well, that affair of Diana and the spouse.
   (2): The British royal family is increasingly losing prestige
        [with Australian people].
   (3): Seriously, the Australian conservative party, that
        conservative royalist faction, has recently been declining.
F18 (4): People of British origin have become fewer indeed.

(c) Information embodying detailed knowledge which falls
    within the speaker's area of expertise (professional or
    otherwise).

In the following speech, M15 is talking about multi-media,
especially cyber-space and its future. He is a professor in a related
field, so his knowledge can be expressed with direct evidentials
although he must have gained it through indirect channels:

(5-5)
M15 (1): sukunaku tomo ima no intaanetto no yoona
         at least current Internet MODI like

         bunsantekina joohoo sisutem de iimasu-to
         dispersed information system INS say-COND

         hijooni ookuno hito ga jibun no hoomu-peegi
         very much many people NOM oneself POSS home-page

         no yoo na mono o motte, jibun no sakuhin o
         MODI like thing OBJ have(te form) oneself POSS creation OBJ

         oitari dekiru wake desu ne.
         put able COP(FOR) PART(RAPP)

    (2): sooshite goku kagirareta hito shika sore o
         then very limited people only that OBJ

         mi-ni-konai
         look-in order to-come(NEG)

    (3): sonokawari sono hito ni kannshinn o motta
         instead that person DAT interests OBJ had

         hito no tame hijyooni fukai mono o yooishite
         person POSS benefit very deep context OBJ prepare(te-form)

         oku-tte koto ga yariyasuku naru-n desu.
         prepare-QUOT COMP OBJ easy to do become-n COP(FOR)

    (4): syoosuu no masu-media dake desu-to
         a few MODI mass-media only COP(FOR)-COND

         soo wa ikanai-n-desu ne..
         soo TOP work(NEG)-n-COP(FOR) PART(RAPP)

M15 (1): At least, if it is a dispersed type of information system like
         the current Internet, an extremely large number of people can
         have their own home-page, and display their creations
         there.

    (2): Then, only a limited number of people will come to see it.

    (3): Instead, it will be easier for us to prepare an enriched
         information base only for those who are interested in us.
    (4): If we rely on a few limited mass-media systems, it won't
         be like that.

In the next example of "professional evidence", speakers M14
and F17 spoke to the public in a TV news program. Although the
propositions were not obtained through their direct experience, the
speakers conveyed their messages as truth, as is required of
professional reporters. In this sense, showing high commitment to the
proposition is part of their professional "register". I interpret these
as cases of professional knowledge.

(5-6)

M14: shijoo saiaku no kibo de shokuchuudoku no kibo
     history worst MODI size INS food-poisoning MODI scale

     ga sara ni hirogatte orimasu.
     NOM further spreading(te-form) COP(FOR)

M14: Victims of the food-poisoning, which is spreading on a
     national-record scale, are further increasing in number.

(5-7)

F17: Taifuu ga mottomo sekkin-suru no wa asagata
     typhoon NOM most approach COMP TOP dawn

     ni naru to iu koto desu. korekara ame ya kaze
     DAT become QUOT COMP COP(FOR) from now rain also wind

     mo kanari tsuyoku natte kimasu.
     also very strong become(te) will become(FOR)

F17: It is announced that the typhoon will be closest to the
     islands around dawn. From now on, rain and wind will get
     stronger.

(d) Information which is unchallengeable by the hearer
    due to its historically and socially qualified status as
    truth.

This type of direct evidence is similar to one of Givon's (1982)
proposition types: "propositions which are to be taken for granted via
force of diverse conventions as unchallengeable by the hearer and thus
requiring no evidential justification by the speaker" (p.24). A
proposition which satisfies condition (d) is not the same as public
information, in that public information is widely known but not
necessarily known to be true. Type (d) information is known to be true
or agreed to be true. A historical fact is an example. Usually this type
of information is common-sense knowledge and so is described with a
direct ending, often with shared-information evidentials. The next
discourse involves a matter related to the Japanese governmental
administrative system that satisfies condition (d) of Corollary two.
(5-8)

F5(1): kondo shoohizei go paasento ni naru-n-datte.
       this time consumption tax 5 % to become-n-hearsay
M4(2): soo soo.
       it is so
F5(3): soo iu-no katteni kimete ii wake.
       such QUOT-COMP freely decide good

       aru janai nanka soo-iu no. juumintoohyoo
       exist isn't there something so-QUOT COMP referendum

       janakute.
       NEG(te-form)
M4(4): juumin toohyoo.
       referendum
F5(5): katteni kimete ii wake.
       freely decide good
M4(6): katte janai yo.
       selfish (NEG) PART(VOC)

       tejun o funderu wake.
       procedure OBJ take(STAT) (explain)

       toohyoo suru dankai de.
       voting do step TEMP

F5(1): I heard that the consumption tax will be 5%.
M4(2): It is so.
F5(3): Can they decide it all by themselves?
       There is something like such and such (isn't there?)
       It isn't referendum..
M4(4): Referendum.
F5(5): Can they decide it without it?
M4(6): They did not decide it all by themselves.
       There was a process [to lead to the resolution].
       At the time of voting.

In M4(6), the speaker explained to F5 that the government had
not ignored the "public will" in deciding to raise the consumption tax
rate; they are members elected by the public and are supposed to
represent the public. This proposition agrees with the well-known
theoretical background of the political representative system of
democracy, and is within the scope of common-sense information.
Therefore, the argument that the government did not ignore the public
in the matter of the consumption tax raise should be handled as logical
truth. This logic in the speaker's mind appeared linguistically in his
direct sentences in (6) as "unchallengeable truth".

The next example shows a different aspect of condition (d).
Speaker F3 experienced the Hanshin Earthquake in 1995, which caused
serious destruction in Kobe, a city in western Japan.
Although Japan as a whole has frequent earthquakes and its residents
are used to them, Kobe had never had such a serious one, and people
believed Kobe would never have such an earthquake. Since "Kobe is an
earthquake-free city" was a kind of socially accepted truth (though
probably not a geologically accurate one), speaker F3 treated this
information as unchallengeable:

(5-9)
F3 (1): de kore wa jishin da to wa omotta-n-dakedo
        then, this TOP earthquake COP COMP TOP thought-n-but

   (2): keikenshita koto ga nai shi
        experience COMP OBJ NEG also

   (3): demo nannka kobe wa jishin ga nai-tte iwareteta
        but somewhat Kobe TOP earthquake NOM NEG-QUOT-said(STAT)

        kara
        ABL

   (4): watashi ga kite kara nankai ka atta-n-dakedo
        I NOM came since a few times happened-n-but

   (5): sonnani ookii jishin ga kuru to wa
        such big earthquake NOM come COMP TOP

        yume ni mo omowanai janai..
        dream LOC even think(NEG) don't we

F3 (1): Then, I thought this was an earthquake, but
   (2): I had no experience, and
   (3): But because somewhat Kobe was said to have no
        earthquake
   (4): There were a few earthquakes ever since I came [to Kobe]
        but..
   (5): We did not think such a big earthquake would come even
        in a dream, did we.

The speaker said line (5), "We did not even dream that we would
have such a big earthquake, did we", as a socially accepted natural
assumption shared by people. This is a kind of common-sense thought
which should be considered to belong to the direct information territory
of everyone (who lives in the area). Since the topic of this case
involves geographic information, the case can also illustrate the
"geographic closeness" of condition (b) of Corollary two. In the
discourse, the speaker used an indirect ending for sentence (3),
probably because of the 'distance' which she still felt from the area.
She said that she had moved to the area five years prior to the incident
and did not yet consider herself to be a real 'local' resident.

In summary, if certain information meets one of the four 
conditions of Corollary two, the information belongs to the speaker's 
information territory and he is entitled to use direct evidentials to 
express the information. Otherwise, the information belongs to someone 
else's information territory and so, in my model, even if the speaker has 
knowledge about the information, the use of indirect evidentials is 
desirable. 

For a speaker, "other people's information territory" includes his 
hearer's information territory. It seems very important to clarify the 
conditions for information to be in the hearer's territory. Logically, 
Corollary two conditions should be straightforwardly applicable to 
characterize information in the hearer's territory. I think it is 
necessary to assume that a speaker has the same kind of criteria for the 
hearer's authorized information ownership. This leads to the next 
corollary: 

COROLLARY 3 (the hearer's information territory):

A hearer's information territory which is assumed by the speaker
contains the following four major types of information:

(a) Information obtained through the hearer's past and current
    direct experience through visual, auditory, or other senses,
    including the hearer's inner feelings;
(b) Information about people, facts, and things close to the hearer,
    including information about plans, actions, and behavior of the
    hearer or other people whom the hearer considers to be close,
    and information of places with which the hearer has a
    geographical relation;
(c) Information embodying detailed knowledge which falls within
    the hearer's area of expertise (professional or otherwise).
(d) Information which is unchallengeable due to its historically and
    socially qualified status as truth, and shared by the speaker.

All these hearer conditions are applied to the knowledge status of
the hearer as assumed or presupposed by the speaker. Presuppositions
and assumptions are based on some kind of evidence; therefore,
naturally this corollary for the hearer side is related to evidentiality.

Information in the hearer's territory but not in the speaker's 
territory is part of the target of the indirect evidentials which are to 
express information to which the speaker does not have direct socially 
authorized access. This framework, in which direct experience and 
indirect experience are contrasted, is, as is noted, based on the universal 
concept of evidentiality and is also relevant to the mental-space model 
in which direct and indirect memories are contrasted. As was described 
in chapter three, the mental-space theory (e.g., Takubo and Kinsui,
1990) argues that both the hearer's knowledge (as assumed by the
speaker) and other information that is indirect for the speaker reside
in the speaker's indirect memory space, and are accessed and described
through indirect linguistic forms. I believe this concept is logical.
In my model, indirect evidentials have two target-information
sub-types: information which the speaker assumes to be the hearer's,
and information which is in neither the hearer's nor the speaker's
territory.

The conditions from Corollaries one, two, and three are 
summarized figuratively in the following diagram: 

[5-10] Direct/indirect evidentials and speaker's/hearer's information
territory in the model

               (A) direct    ---- information in the speaker's
               evidentials        territory
Evidentials --
                             ---- information only in the hearer's
               (B) indirect       territory
               evidentials
                             ---- information outside of both
                                  speaker's and hearer's territory

Information shared by the speaker and the hearer 

As the next stage, it is necessary to position information which is 
"shared" by both speaker and hearer in the model. Data from the 
informants indicated that there are a few different situations in which 
certain information is shared. 

Kamio's model has some problems concerning the issue of shared 
information. In his early model, in short, Kamio assumed one 
information category was shared by both speaker's and hearer's 
information territories (i.e., territory A in [4-21]). This shared 
information category was divided into three different levels in his later 
study (1994) as introduced in chapter three (pp. 80-81) as cases (B), (BC) 
and (CB) shown below again: 

[5-11] Three types of shared information between the speaker and the
hearer by Kamio (1994)

(B)  the speaker considers that a given piece of information falls
     completely into both the speaker's and the hearer's territory of
     information [i.e., information is completely shared]; or
     information falls completely into the hearer's territory, and only
     partially into the speaker's territory.
     (Case B: n < Speaker ≤ Hearer = 1)

(BC) the speaker considers that a given piece of information falls
     within his own territory to the fullest degree, while it falls within
     the hearer's territory to a lesser degree.
     (Case BC: 1 = Speaker > Hearer > n)

(CB) the speaker assumes that information falls within his own
     territory to some extent but falls more deeply within the hearer's
     territory (but the speaker does not necessarily assume that it falls
     into the hearer's territory to the fullest degree).
     (Case CB: n < Speaker < Hearer)

(1994:86-95)

In the above, "n" is the threshold value for the speaker's or the
hearer's territory, and Kamio's basic premise for [5-11] above is
"the assumption that information takes values between (and including)
1 and 0 on the speaker's and the hearer's scale", assuming two linear
psychological scales, one for the speaker and the other for the hearer
(1994: 86). That is to say, a given piece of information can fall in one
party's information territory to a great degree and at the same time
fall in the other party's information territory to a small degree. At
first glance, one may feel that this makes sense because we, as
speakers, "feel" that we know some things really well and other things
only to some extent. However, analyzing the data using concept [5-11]
for shared information made me realize that the idea is problematic
from the viewpoint of evidentiality.
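
For reference, Kamio's three shared-information cases can be written
out as comparisons on the two scales. The sketch below is only an
illustration: the function name kamio_shared_case and the sample values
are hypothetical, the inequality for case CB is read here as
n < Speaker < Hearer, and case B is tested first because its second
alternative (Hearer = 1, Speaker partial) would otherwise also satisfy
the CB comparison.

```python
from typing import Optional


def kamio_shared_case(speaker: float, hearer: float, n: float) -> Optional[str]:
    """Return 'B', 'BC', or 'CB' for scale values in [0, 1], else None.

    speaker, hearer: positions of one piece of information on the
    speaker's and the hearer's scale (1 = fully inside the territory).
    n: the threshold value for membership in a territory.
    """
    if not (0.0 <= speaker <= 1.0 and 0.0 <= hearer <= 1.0):
        raise ValueError("scale values must lie between 0 and 1")
    if n < speaker <= hearer == 1.0:
        return "B"    # fully the hearer's; fully or partly the speaker's
    if speaker == 1.0 and n < hearer < 1.0:
        return "BC"   # fully the speaker's; the hearer's to a lesser degree
    if n < speaker < hearer:
        return "CB"   # deeper in the hearer's territory than the speaker's
    return None       # not one of the three shared cases


# Invented values with an assumed threshold n = 0.5.
for s, h in [(1.0, 1.0), (0.7, 1.0), (1.0, 0.7), (0.6, 0.8), (0.3, 0.9)]:
    print(s, h, kamio_shared_case(s, h, n=0.5))
```

The sketch makes visible what the scalar view requires: a numerical
position for every piece of information on both scales.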

In this study, the conditions for the speaker's/hearer's territory
information are clarified by Corollaries two and three. The status of
information in relation to the speaker's/hearer's territory is either
IN or OUT: there is no partial fulfillment of the condition. Therefore,
Kamio's information classification in [5-11] is not appropriate for this
study. Kamio's classification in [5-11], which is based on the concept
of relative distance among the speaker, the hearer, and the information,
is difficult to conceptualize concurrently with the conditions for the
speaker's territory information.²

However, as a matter of fact, there are cases which seem on the surface
to present Kamio's (B), (BC), and (CB) situations in [5-11], but
those cases show the difference between "owning information" and
"knowing information": a speaker can have a piece of information
outside his territory through hearing, deducing, inducing, and other
means, and naturally he can claim that he knows it, but the information
may not actually be his own. The degree of evidence attached to each
kind of information should theoretically be different.

I simplified the concept of shared information in my model on the basis
of the above view. Cases in which a piece of information is completely
shared by both parties are common; for example, the case
where both conversationalists are exposed to the same on-going event.
Utterances such as ii tenki desu ne# (It is a fine day, as we both know)
and koko chotto urusai ne# (It is a bit noisy here, as we both know) are
examples of the linguistic outcome of sharing direct experience. On the
other hand, the speaker and the hearer do not necessarily have to share
the same direct experience for the information to fall into both
parties' territories. For example, two parties can share knowledge from
the same profession, knowledge from familiarity with the same places, or
other knowledge to which they are both authorized to have privileged
access. Two students taking the same course can have the
same authorized knowledge about the course and say ano sensei kibishii
ne# (That teacher is strict, as we both know). Two managers in the same
business field may say to each other saikin, chotto keiki ga warui ne#
(These days business is not good, is it?). The case of "complete
sharing" of information is the only case in which information falls into
both parties' information territories in my model.

The model assumes the following four cases of shared-information
in relation with the concept of information territory:

[5-12] Types of shared-information in this study:

(a) Information is completely shared in both speaker's and hearer's
    territory (i.e., direct information to both speaker and hearer),
(b) Information is primarily in the speaker's information territory,
    but the hearer may know the information (i.e., the speaker's
    direct information, the hearer's indirect information),
(c) Information is primarily in the hearer's information territory,
    but the speaker knows the information (i.e., the speaker's
    indirect information, the hearer's direct information),
(d) Information is neither in the speaker's territory nor the hearer's
    territory, but both parties may know it (i.e., both parties' indirect
    information).

In the above four cases in which the information is "known" to
both parties, types (a) and (b) information are the speaker's
information, type (c) information is the hearer's information, and type
(d) information is outside of both parties' territories. The outline of the
whole system of the relationship between evidentiality and information
types, which was briefly introduced in chapter four, is described here
as follows:

[5-13] Direct/indirect evidentials and speaker's/hearer's information
territory

                           INFORMATION IN THE SPEAKER'S TERRITORY

              direct       (A) information that the speaker assumes the
              evidentials      hearer does not know
                           (B) information that the speaker assumes the
                               hearer may know
                           (C) information that the speaker assumes also
                               falls into the hearer's territory
Evidentials
                           INFORMATION IN THE HEARER'S TERRITORY

                           (D) information that the speaker does not
                               know (question)
              indirect     (E) information that the speaker knows
              evidentials      (reported or inferred evidence)

                           (F) INFORMATION OUT OF BOTH SPEAKER'S AND
                               HEARER'S TERRITORIES
                               (reported or inferred evidence)

The above chart [5-13] provides the basic framework of the 
evidentiality model of this study. Corollaries provide rules and 
characterization of evidential usage in the given framework. The 
nature of each of the proposition (or information) types (A) to (F) in 
chart [5-13] is illustrated in the following section in relation with 
commonly used sentence-ending evidential forms which were used for 
each proposition type. 
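
The branching in chart [5-13] can also be read as a simple decision
procedure. The following sketch is an illustration only: the function
names and the four boolean parameters are hypothetical, and the mapping
encodes the basic framework of the chart (types (A) to (C) with direct
evidentials, types (D) to (F) with indirect evidentials), not the
distribution of forms actually observed in the data.

```python
def proposition_type(in_speaker: bool, in_hearer: bool,
                     hearer_may_know: bool = False,
                     speaker_knows: bool = True) -> str:
    """Assign a proposition to one of types (A)-(F) of chart [5-13].

    in_speaker / in_hearer: whether the information is in the speaker's
    or (as the speaker assumes) the hearer's territory.
    hearer_may_know: relevant only when the information is the speaker's.
    speaker_knows: relevant only when the information is the hearer's.
    """
    if in_speaker:
        if in_hearer:
            return "C"                       # also in the hearer's territory
        return "B" if hearer_may_know else "A"
    if in_hearer:
        return "E" if speaker_knows else "D"  # (D) surfaces as a question
    return "F"                               # outside both territories


def evidential_class(ptype: str) -> str:
    """Types (A)-(C) take direct evidentials, (D)-(F) indirect ones."""
    if ptype in ("A", "B", "C"):
        return "direct"
    if ptype in ("D", "E", "F"):
        return "indirect"
    raise ValueError(f"unknown proposition type: {ptype!r}")


# A shared on-going event such as the weather (cf. ii tenki desu ne).
weather = proposition_type(in_speaker=True, in_hearer=True)
print(weather, evidential_class(weather))  # C direct
```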

ANALYSIS OF THE SENTENCE-ENDING EVIDENTIAL FORMS 

As shown in the previous chapter, I have obtained a list of
sentence-ending evidential forms from natural discourse data, and those
forms were separated into groups according to their evidentiality types
(cf. chapter four [4-5], also appendix B). "Direct forms" (D) are the
Group (1) and (2) ending forms; "semi-direct forms" (SD) are the Group
(3) and (5) forms, which are direct but acknowledge the hearer's
knowledge; and "DQ forms" are the Group (4) ending forms, which are
syntactically direct but seek the hearer's agreement. All three of
these sub-groups of direct forms are listed as "D", "SD", or "DQ" in
appendix B (list of ending-forms). "Indirect forms" are listed with ID
[i.e., Group (7), (8), and (10) ending forms], AUXs are epistemic
auxiliary endings [i.e., Group (9)], and "questions" are listed with Q
[i.e., Group (6) and some forms from other groups].
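
The correspondence between the numbered groups and the labels used in
appendix B can be summarized as a lookup table. The sketch below is a
simplification under stated assumptions: the names GROUP_LABEL and
tally_labels are hypothetical, the sample input is invented, and the
table assigns Q to Group (6) only, leaving aside the question forms
that come from other groups.

```python
from collections import Counter

# Group number -> evidentiality label, following the grouping in the text.
GROUP_LABEL = {
    1: "D", 2: "D",              # direct forms
    3: "SD", 5: "SD",            # semi-direct forms
    4: "DQ",                     # direct in syntax, seeking agreement
    6: "Q",                      # questions (simplified: Group (6) only)
    7: "ID", 8: "ID", 10: "ID",  # indirect forms
    9: "AUX",                    # epistemic auxiliary endings
}


def tally_labels(groups):
    """Count evidentiality labels for a sequence of group numbers."""
    return Counter(GROUP_LABEL[g] for g in groups)


# Invented group numbers for six utterances, not the dissertation's data.
sample = [1, 2, 4, 7, 9, 1]
print(tally_labels(sample))
```

A tally of this kind, kept separately for each proposition type and
speech situation, is the sort of count on which a quantitative
comparison could be based.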

The relationship among the three factors, i.e., (1) the occurrence 
of the sentence-ending forms, (2) type of propositional content of the 
sentence (cf. [5-13] in this chapter, also Appendix D), and (3) the speech 
situation in which the sentence was uttered (cf. appendix A), was 
quantitatively and qualitatively analyzed. 

Information (i.e., proposition) types and sentence-ending evidential 
forms 

As explained earlier, six basic types of information are ultimately
assumed in the model (cf. [5-13]). Prior to the data analysis, I had
expectations regarding the prevalent forms of evidentials for each
information type: information types (A) to (C) were expected to be the
target of the direct evidentials; however, (B) and (C), due to the
involvement of the hearer's knowledge, were expected to be expressed
generally with semi-direct evidentials, which are less direct than
genuinely direct evidentials, and so on. I initially grouped
sentence-ending forms based on such expectations (cf. appendix B).

However, the results did not meet my expectations so neatly. 
There was a fairly wide range of evidential usage within the same 
propositional type, due to differences in speech situations and probably 
also to each informant's personal preference, but a set of observable 
systematic patterns of behavior was certainly also found. Statistical 
data for the occurrence of sentence-ending forms for each proposition 
type are summarized and explained below. 



The nature of (A) type propositions, i.e., "INFORMATION IN 
THE SPEAKER'S TERRITORY that the speaker assumed the 
hearer does not know" may not need to be further explained. The 
function of language for this type of proposition is transferring new 
information to the hearer. The utterance by a student (S2) in the 
following discourse between a teacher and a student, which was shown 
earlier, is a good example of (A) type propositions: 

(5-14) 

F26: 
amerika made dono kurai jikan kakatta ka oboeteru. 
USA till how long time took COMP remember 

S2: 
wasureta. neta yo. 
forgot slept PART(VOC) 

F26: Do you remember how long it took to go to America? 
S2: I forgot. I slept. 

I forgot and I slept are direct expressions of the speaker's own 
experience. Therefore, this type of proposition was expressed by direct 
sentence-endings. This is reasonable intuitively as well as 
theoretically. 

Occurrences of sentence-ending forms are counted by groups for 
proposition (A) type utterances for different types of speech situations. 

The data from "formal conversation", "informal friend" and "family" 
discourses are listed below to indicate general preference by the 
informants: 




[5-15] Occurrences of sentence-ending forms by groups for (A) type 
propositions for formal conversation discourse, informal friend 
discourse, family discourse, and combined discourse types 

ENDING-FORMS                      FORMAL       FRIEND       FAMILY       ALL TYPES 
(group of evidentials) 
G1  (direct)                      629 (59%)    591 (73%)    490 (79%)    2370 (72%) 
G2  (D rapport -ne)               299 (28%)    132 (16%)     77 (12%)     556 (17%) 
G3  (SD tag question.)              5 ( 0%)     37 ( 4%)      2 ( 0%)      44 ( 1%) 
G4  (DQ direct but questioning)    40 ( 3%)     12 ( 1%)      9 ( 1%)      67 ( 2%) 
G5  (SD "sharing" ne#)              2 ( 0%)      0 ( 0%)      0 ( 0%)       3 ( 0%) 
G6  (Question forms)               17 ( 1%)      5 ( 0%)      7 ( 1%)      33 ( 1%) 
G7  (ID inference)                  1 ( 0%)      1 ( 0%)      0 ( 0%)       2 ( 0%) 
G8  (ID hearsay)                    2 ( 0%)      0 ( 0%)      1 ( 0%)       4 ( 0%) 
G9  (Auxiliary)                    14 ( 1%)      0 ( 0%)      5 ( 0%)      21 ( 0%) 
G10 (ID I think)                   55 ( 5%)     25 ( 3%)     29 ( 4%)     154 ( 4%) 
total                            1064          803          620          3254 
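
The shares in [5-15] can be recomputed from the raw counts. The sketch 
below, in Python, uses the counts as printed in the chart; the names 
COUNTS_A, column_totals and truncated_percent are illustrative 
assumptions and not part of the study. It also assumes that the printed 
percentages are truncated rather than rounded, which is an inference 
from the figures (for example, 17 of 1064 is printed as 1%) and is not 
stated in the text. 

```python
# Counts from chart [5-15]; columns are FORMAL, FRIEND, FAMILY, ALL TYPES.
COLUMNS = ("FORMAL", "FRIEND", "FAMILY", "ALL")
COUNTS_A = {
    "G1": (629, 591, 490, 2370),
    "G2": (299, 132, 77, 556),
    "G3": (5, 37, 2, 44),
    "G4": (40, 12, 9, 67),
    "G5": (2, 0, 0, 3),
    "G6": (17, 5, 7, 33),
    "G7": (1, 1, 0, 2),
    "G8": (2, 0, 1, 4),
    "G9": (14, 0, 5, 21),
    "G10": (55, 25, 29, 154),
}


def column_totals(counts):
    """Sum each discourse-type column over all ending-form groups."""
    return tuple(
        sum(row[i] for row in counts.values()) for i in range(len(COLUMNS))
    )


def truncated_percent(count, total):
    """Integer percentage, truncated (assumed convention, see lead-in)."""
    return count * 100 // total


totals = column_totals(COUNTS_A)

# Combined share of Group (1) and Group (2) in formal discourse.
formal_g1_g2 = truncated_percent(
    COUNTS_A["G1"][0] + COUNTS_A["G2"][0], totals[0]
)
```

Under these assumptions the column totals reproduce the printed totals, 
and Group (1) and Group (2) together account for 87% of (A) type 
endings in formal discourse. 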

Clearly, across speech situations, Group (1) and Group (2) type 
ending forms were preferred for (A) type propositions. In formal 
discourse and informal friend discourse the following ten ending-forms 
were most popular among informants. 

[5-16] Top ten sentence-ending forms for type (A) propositions, formal 
conversational discourse 

SENTENCE-ENDING FORMS                      OCCURRENCE 
 1. G(1) D direct (formal)                 122 (11%) 
 2. G(1) D n da yo (formal)                 90 ( 8%) 
 3. G(1) D n dakedo (formal)                79 ( 7%) 
 4. G(2) D ne . (formal)                    77 ( 7%) 
 5. G(1) D n da (formal)                    74 ( 6%) 
 6. G(2) D n da yo ne . (formal)            62 ( 5%) 
 7. G(2) D direct (informal)                58 ( 5%) 
 8. G(2) D n da ne . (formal)               52 ( 4%) 
 9. G(1) D kara (formal)                    38 ( 3%) 
10. G(2) D yo ne . (formal)                 28 ( 2%) 
83 others                                  384 (36%) 
total                                     1064 

[5-17] Top ten sentence-ending forms for type (A) propositions, 
informal friend discourse 

SENTENCE-ENDING FORMS                      OCCURRENCE 
 1. G(1) D direct (informal)               176 (21%) 
 2. G(1) D no . (informal)                 128 (15%) 
 3. G(2) D no ne . (informal)               64 ( 7%) 
 4. G(1) D no yo (informal)                 41 ( 5%) 
 5. G(1) D kedo (informal)                  32 ( 3%) 
 6. G(1) D noun (informal)                  32 ( 3%) 
 7. G(1) D yo (informal)                    30 ( 3%) 
 8. G(1) D sa (informal)                    30 ( 3%) 
 9. G(1) D kara (informal)                  29 ( 3%) 
10. G(1) D n dakedo (informal)              24 ( 2%) 
49 others                                  217 (27%) 
total                                      803 

Although Group (1) type direct endings were dominant in both 
discourse types, (A) type propositions were expressed with more 
assertive sentence-ending forms in informal discourse. In informal 
discourse, particles -no, -yo, and -sa, which are fairly assertive, were 
preferred together with most informal noun-ending forms. In formal 
discourse, descending -ne . (i.e., rapport -ne) was used in 14% and the 
-n da cluster (e.g. "explaining", "sending empathy") was used in 29% of 
sentences, implying that the informants are more sensitive to the 
existence of the hearer in formal discourse. 

From the entire result of the analysis of the data for each 
discourse type, it seems reasonable to assume that Group (1) type 
ending-forms and rapport -ne from Group (2) represent the most 
preferred sentence-ending evidentials for proposition type (A). It also 
seems that the formality of conversation motivates the speaker to use 
the -n da cluster and Group (2) type rapport -ne. Furthermore, formal 
discourses involved more frequent occurrence of indirect endings for 
this type of proposition, such as Group (4) ending forms (i.e., a direct 
sentence but seeking the hearer's agreement) and also Group (10) "I 
think" endings.3 

(B) INFORMATION IN THE SPEAKER'S TERRITORY 
and the speaker assumes the hearer may have some 
knowledge: 
From the universal theory of evidentiality, propositions of this 
type should be expressed with direct evidentials also. However, as the 
speaker assumes that the hearer has some knowledge of his proposition, 
some difference from the (A) type proposition is expected. 

(5-18) 

M22: Tatoeba, uchi nanka issetai de okane 
for example my household QUOT one household in money 

kaseide kuru no boku dake desho .

 earn come COMP me only isn't it 
M22: For example, in case of our household, I am the only person 
who earns money aren't I? 

In this utterance, M22 talked about his household matter, which 
is private, but he assumed that the hearers knew that the proposition is 
true. 
(5-19) 
M4 (1): uchi no USA de seerusu man de iwayuru 
        our POSS USA office LOC salesman so-called 

        nihon no seerusu man ga hashirimawatteru no de 
        Japan MODI salesman NOM running-around(STAT) COMP 

        nenshuu       30 man  doru    toka 40 man  doru    to ka sonna mon   da  yo. 
        yearly income 300,000 dollars etc. 400,000 dollars etc.  such  thing COP PART(VOC) 

F5 (2): 3 man doru desho . 
        30,000 dollar isn't it 

M4 (1): Our salesman in USA, who is so-called, as in Japan, a 
        salesman who is moving around receives $300,000 or 
        $400,000 or only like that. 

F5 (2): It is $30,000, isn't it. 

In this discourse, F5 corrected the figure that M4 introduced in 
(1) assuming that he made a simple calculation mistake, knowing the 
correct figure should be $30,000. The entire topic is in M4's territory, 
but the proposition in F5(2) (America's average salary) is F5's territory 
information as she lives in America. F5 believed that her proposition in 
(2) was known by the hearer M4 although M4 gave different figures in 
the previous sentence. 
As these cases suggest, for (B) type propositions, the "confirmation 
daroo ." form and the negative-ending janai ., both of which are 
tag-question forms with falling intonation, were preferred as expected, 
but this is largely true of informal discourse. In formal discourse, more 
question type ending forms were observed. For (B) type propositions, 
Group (1) to Group (4) ending forms were generally preferred, with 
differences in each discourse type as shown in the following three 
charts: 

[5-20] Occurrence of sentence-ending forms by groups for type (B) 
propositions, for formal conversation discourse, informal friend 
discourse, family discourse, and combined discourse types 

ENDING-FORMS                      FORMAL      FRIEND      FAMILY      ALL TYPES 
G1  (direct)                       3 ( 6%)     0 ( 0%)     9 (21%)     20 (11%) 
G2  (D rapport -ne)                9 (19%)     7 (15%)    12 (29%)     38 (22%) 
G3  (SD tag question.)             6 (13%)    28 (62%)     9 (21%)     50 (29%) 
G4  (DQ direct but questioning)   19 (41%)     9 (20%)    10 (24%)     46 (26%) 
G5  (SD sharing ne#)               3 ( 6%)     0 ( 0%)     0 ( 0%)      4 ( 2%) 
G6  (Question forms)               5 (10%)     1 ( 2%)     1 ( 2%)      9 ( 5%) 
G7  (ID inference)                 0 ( 0%)     0 ( 0%)     0 ( 0%)      0 ( 0%) 
G8  (ID hearsay)                   0 ( 0%)     0 ( 0%)     0 ( 0%)      0 ( 0%) 
G9  (Auxiliary)                    0 ( 0%)     0 ( 0%)     0 ( 0%)      2 ( 1%) 
G10 (ID I think)                   1 ( 2%)     0 ( 0%)     0 ( 0%)      2 ( 1%) 
total                             46          45          41          171 

[5-21] Top ten sentence-ending forms for type (B) propositions, formal 
conversation discourse 

SENTENCE-ENDING FORMS                            OCCURRENCE 
 1. Group (4) DQ daroo . (tag-Q, formal)          6 (13%) 
 2. Group (4) DQ yo ne . (formal)                 6 (13%) 
 3. Group (4) DQ ne . (formal)                    4 ( 8%) 
 4. Group (3) SD daroo . (tag-Q, formal)          3 ( 6%) 
 5. Group (2) D yo ne . (formal)                  3 ( 6%) 
 6. Group (6) Q ka . (formal)                     2 ( 4%) 
 7. Group (1) D n da (formal)                     2 ( 4%) 
 8. Group (6) Q n desu ka . (formal)              1 ( 2%) 
 9. Group (2) D da ne . (formal)                  1 ( 2%) 
10. Group (3) SD n daroo . (tag-Q, formal)        1 ( 2%) 
16 others                                        16 (34%) 
total                                            46 



[5-22] Top ten sentence-ending forms for type (B) propositions, 
informal friend discourse 

SENTENCE-ENDING FORMS                            OCCURRENCE 
 1. Group (3) SD ja nai . (tag-Q, informal)      16 (35%) 
 2. Group (3) SD daroo . (tag-Q, informal)        7 (15%) 
 3. Group (4) DQ quasi-q intra-s (informal)       3 ( 6%) 
 4. Group (3) SD daroo . (tag-Q, formal)          3 ( 6%) 
 5. Group (2) D yo ne . (informal)                2 ( 4%) 
 6. Group (4) DQ ja nai . (tag-Q, informal)       2 ( 4%) 
 7. Group (4) DQ quasi-q ending (informal)        2 ( 4%) 
 8. Group (2) D no ne . (informal)                2 ( 4%) 
 9. Group (2) D n da ne . (formal)                2 ( 4%) 
10. Group (6) Q ja nai ka . (formal)              1 ( 2%) 
5 others                                          5 (11%) 
total                                            45 

The total number of (B) type propositions was relatively small. 
This is partly due to the difficulty of categorizing utterances into this 
particular type. Although it was often not difficult to find proposition 
(B) type utterances from the background information I had about the 
speakers and also the propositions, to what degree the speakers should 
expect their propositions to be known to the hearers was sometimes 
difficult to know. Accordingly, the volume of (B) type data remains 
small because sentences that are ambiguous in terms of 
proposition-type were excluded from the analysis. 
From the limited data, it is still observable that Group (3) type 
ending forms (i.e., falling tag-questions), janai. (isn't it?) and daroo. 
(isn't it?), were preferred for (B) type information. In formal 
conversation, question type forms (Group 6) and direct sentences with 
questioning intonation (i.e., DQ, Group 4 type) were used more often 
than the expected G(3) type endings. Presumably, question-like 
utterances were preferred in formal discourse to show the speaker's 
respect for knowledge which the hearer possibly has. For informal 
discourse, more assertive Group (2) type endings (rapport-ne) were also 
used, suggesting there is a wide variety of choice of ending forms 
among speakers for this type of proposition. It may be concluded that 
Group (3) type descending-tone tag-questions represent the 
most-preferred evidential type for (B) type propositions. In less formal 
speech situations, Group (2) type rapport-ne is also common, and in 
high formality situations, Group (4) type "seeking-agreement" 
evidentials as well as real questioning endings are preferred. 

(C) INFORMATION IN THE SPEAKER'S TERRITORY 
that the speaker assumes also falls into the hearer's 
territory: 
This type of proposition meets the conditions of information in 
both speaker's territory and hearer's territory; both parties have 
socially authorized primary access to the propositional information. 
Some examples of (C) type propositions are shown below: 
(5-23) 
M11 (1): jitsuwa ne, doomo oomu shinrikyoo ga sarin o 
         in fact RAPP it looks Aum-cult NOM Sarin OBJ 

         fukumeta dokugasu o tsukutteiru to iu uwasa wa 
         including poison gas OBJ making(PROG) QUOT rumor TOP 

         sono mae no toshi kara atta-n-desu ne.. 
         that before MODI year from existed-n-COP(FOR) PART(RAPP) 

F22 (2): matsumoto sarin-tte iu no ga arimashita ne .. 
         Matsumoto Sarin-QUOT COMP NOM existed PART(CONF) 

         demo are mo oomu ka dooka wakaranakatta desho . 
         but that also Aum whether or not was not clear wasn't it 

         ano jiten de wa. 
         point of time LOC CONT 

M11: Actually, there was somewhat a rumor from the previous 
     year that Aum-cult seems to be producing poison gas 
     including Sarin gas. 

F22: There was a case of Matsumoto-Sarin, wasn't there? 
     But, they didn't know that was done by Aum, did they? 
     At that time. 

M11 is a journalist investigating the Aum-cult case, and F22, who 
also seems to know the case well as a TV commentator, assumed that the 
propositions in her sentences are shared by the hearer, M11. The event 
of Matsumoto-Sarin is a well-known historical fact. F22's utterances end 
with tag-questions with a rising tone. 

In the next speech, M13 is talking about the life of a Japanese 
Sumoo-wrestler. Since the topic is widely shared knowledge among 
Japanese people, the speaker assumed the hearer has the same 
information in her information territory: 

(5-24) 

M13 (1): osumoo-san-tte juuni kurai de 
         Sumoo-wrestler-QUOT 12 years old about TEMP 

         nyuumon-shite ne., 
         enter the world PART(CONF) 

    (2): de kodomo kara are zutto shitete 
         then child years from that constantly doing(PROG) 

    (3): de   sanjuuni     kurai de   intaishite toshiyori ni  naru   deshoo . 
         then 32 years old about TEMP retire(te) senior    DAT become don't they 

M13 (1): Speaking about Sumoo-wrestlers, they enter the world of 
         sumoo at the age of 12 or around (am I right?). 
    (2): Then continue to do that [=sumoo] persistently 
         (incomplete).. 
    (3): Then retire at the age of 32 or around and become 
         Toshiyori (lit. old man), don't they? 

In the two examples above, the use of the ending forms of tag-
question with a rising tone, and "confirming -ne " indicates that the 
speaker assumed that the proposition was shared by both parties' 
territories. The form that I had particularly expected for this 
propositional type was "sharing -ne #", which was also observed 
frequently in formal discourse. Informal discourse showed a high 
frequency of direct forms. DQ forms were preferred in all discourse 
types. 




[5-25] Occurrence of sentence-ending forms by groups for type (C) 
propositions, for formal conversation discourse, informal friend 
discourse, family discourse, and combined discourse types 

ENDING-FORMS                      FORMAL       FRIEND      FAMILY      ALL TYPES 
G1  (direct)                       31 (11%)    19 ( 9%)    78 (31%)    297 (28%) 
G2  (D rapportive -ne)             34 (12%)    33 (17%)    40 (16%)    140 (13%) 
G3  (SD tag question.)              3 ( 1%)     9 ( 4%)    11 ( 4%)     31 ( 2%) 
G4  (DQ direct but questioning)    88 (31%)    78 (40%)    82 (34%)    314 (30%) 
G5  (SD sharing-ne#)              102 (36%)    32 (16%)    24 ( 9%)    182 (17%) 
G6  (Question forms)               10 ( 3%)     9 ( 4%)    11 ( 4%)     50 ( 4%) 
G7  (ID inference)                  0 ( 0%)     0 ( 0%)     0 ( 0%)      0 ( 0%) 
G8  (ID hearsay)                    1 ( 0%)     0 ( 0%)     0 ( 0%)      2 ( 0%) 
G9  (Auxiliary)                     4 ( 1%)     5 ( 2%)     1 ( 0%)     10 ( 0%) 
G10 (ID I think)                    3 ( 1%)     6 ( 3%)     0 ( 0%)     14 ( 1%) 
total                             276         191         247         1039 

[5-26] Top ten sentence-ending forms for type (C) propositions, formal 
conversation discourse 

SENTENCE-ENDING FORMS                          OCCURRENCE 
 1. G(5) SD yo ne# (share, formal)             45 (16%) 
 2. G(5) SD ne# (share, formal)                45 (16%) 
 3. G(4) DQ daroo . (tag-Q, formal)            23 ( 8%) 
 4. G(4) DQ ne. (confirm, formal)              21 ( 7%) 
 5. G(4) DQ yo ne. (formal)                    20 ( 7%) 
 6. G(2) D ne . (formal)                       17 ( 6%) 
 7. G(1) D dakedo (formal)                     11 ( 3%) 
 8. G(4) DQ n da ne . (confirm, formal)         8 ( 2%) 
 9. G(2) D yo ne . (formal)                     6 ( 2%) 
10. G(5) SD kara ne# (confirm, formal)          4 ( 1%) 
48 others                                      76 (27%) 
total                                         276 



[5-27] Top ten sentence-ending forms for type (C) propositions, informal 
friend discourse 

SENTENCE-ENDING FORMS                          OCCURRENCE 
 1. G(4) DQ daroo . (tag-Q, informal)          21 (10%) 
 2. G(4) DQ ja nai . (tag-Q, informal)         21 (10%) 
 3. G(5) SD yo ne# (share, informal)           17 ( 8%) 
 4. G(2) D yo ne . (informal)                  11 ( 5%) 
 5. G(4) DQ n ja nai . (tag-Q, formal)         10 ( 5%) 
 6. G(5) SD ne # (share, informal)              7 ( 3%) 
 7. G(2) D ne . (informal)                      7 ( 3%) 
 8. G(3) SD ja nai . (tag-Q, informal)          6 ( 3%) 
 9. G(2) D ne . (informal)                      6 ( 3%) 
10. G(1) D direct (informal)                    6 ( 3%) 
53 others                                      79 (41%) 
total                                         191 


In both discourse types, the Group (5) type ending-form, ne# 
("sharing -ne"), and Group (4) type tag-questions with a rising tone 
(daroo., janai.) were preferred. Therefore, these types can represent 
the sentence-ending evidentials for (C) type propositions although 
Group (1) and (2) type evidentials, which are fairly assertive, were also 
used. 

Simple direct endings which were not even most preferred in 
expressing (A) type propositions (i.e., only speaker's information) 
appeared to be popular in the combined results of all types of discourse 
(10% share with formal forms and informal forms combined). This is 
due to frequent use of simple direct forms in family discourse (12%), 
courtroom discourse (42%), and school teacher discourse (12%) for type 
(C) propositions. It seems that in these discourse types, shared 
knowledge tends to be treated in direct forms for different reasons. 
Analysis of this phenomenon will be discussed in a later section. 

(D) INFORMATION THAT THE SPEAKER ASSUMED TO BE IN THE 
HEARER'S TERRITORY, and that the speaker does not 
know. 
Naturally, this kind of proposition, when uttered, seeks 
information from the hearer, so it is expressed in the form of a 
question or a statement form with a clearly questioning intonation. A 
Japanese formal question is formed by simply adding the particle -ka at 
the end of a statement. Therefore, in a formal sentence, (D) type 
information is expressed in a sentence ending with -desu ka, -masu ka, 
-ja arimasen ka, deshoo ka, and other combinations of a formal 
sentence-ending form plus ka. However, in informal conversation, the 
questioning particle ka is hardly used. A direct sentence-ending with 
a rising tone is the most popular way to express (D) type propositions 
casually. The particle no with a rising tone is also often used to make 
an informal question sentence, as seen in the following two examples: 

(5-28) 

F22 (1): (Looking at a picture of M15's cat) 
         Ogyoogi yoku, hai doozo-tte iu, ne#. 
         behave well "here I am"-QUOT, PART(SHAR) 

         kichitto suwatte. 
         neatly sit (STAT) 

    (2): osu . 
         male 

M15 (3): osu desu ne.. 
         male COP(FOR) PART(RAPP) 

F22 (1): (Looking at a picture of M15's cat) 
         It sits neatly, as if saying "Here you are [take my picture]". 
    (2): Male? (direct noun ending) 

M15 (3): It is male. 

(5-29) 

F13 (1): America mo shoohi-zei aru no . 
         America also consumption-tax exist Q 

F5 (2):  aru yo. eeeto, hatten-go paasento. tekisasu wa. 
         exist PART(VOC) well 8.5% Texas CONT 

F13 (1): Do you have consumer tax in America? (direct rising no.) 

F5 (2):  Yes, we do. Well, it is 8.5% in Texas. 

Statistical data for formal and informal friend discourse showed 
similar results except for the point which has just been explained. 

[5-30] Occurrences of sentence-ending forms by groups for type (D) 
propositions, for formal conversation discourse, informal friend 
discourse, family discourse, and combined discourse types 

ENDING-FORMS                      FORMAL       FRIEND       FAMILY       ALL TYPES 
G1  (direct)                        8 ( 4%)      1 ( 0%)      5 ( 3%)     21 ( 3%) 
G2  (D rapportive -ne)              2 ( 1%)      3 ( 2%)      0 ( 0%)      5 ( 0%) 
G3  (SD+ tag question)              0 ( 0%)      0 ( 0%)      0 ( 0%)      1 ( 0%) 
G4  (DQ direct but questioning)     4 ( 2%)      0 ( 0%)      5 ( 3%)     13 ( 1%) 
G5  (SD sharing ne#)                1 ( 0%)      1 ( 0%)      0 ( 0%)      1 ( 0%) 
G6  (Question forms)              163 (87%)    121 (92%)    139 (92%)    615 (90%) 
G7  (ID inference)                  0 ( 0%)      0 ( 0%)      0 ( 0%)      0 ( 0%) 
G8  (ID hearsay)                    4 ( 2%)      0 ( 0%)      0 ( 0%)      4 ( 0%) 
G9  (Auxiliary)                     3 ( 1%)      2 ( 1%)      1 ( 0%)     14 ( 2%) 
G10 (ID I think)                    1 ( 0%)      3 ( 2%)      0 ( 0%)      5 ( 0%) 
total                             186          131          150          679 


[5-31] Top ten sentence-ending forms for type (D) propositions, formal 
conversational discourse 

SENTENCE-ENDING FORMS                      OCCURRENCE 
 1. G(6) Q ka . (formal)                   52 (27%) 
 2. G(6) Q ka . (formal)                   21 (11%) 
 3. G(6) Q n desu ka . (formal)            17 ( 9%) 
 4. G(6) Q desu ka . (formal)              16 ( 8%) 
 5. G(6) Q direct . (formal)               14 ( 7%) 
 6. G(6) Q n desu ka . (informal)          12 ( 6%) 
 7. G(6) Q noun . (informal)                9 ( 4%) 
 8. G(6) Q direct . (informal)              7 ( 3%) 
 9. G(6) Q no . (formal)                    5 ( 2%) 
10. G(6) Q ka . (informal)                  5 ( 2%) 
21 others                                  28 (15%) 
total                                     186 

[5-32] Top ten sentence-ending forms for type (D) propositions, 
informal friend discourse 

SENTENCE-ENDING FORMS                      OCCURRENCE 
 1. G(6) Q no . (informal)                 51 (38%) 
 2. G(6) Q direct . (informal)             25 (18%) 
 3. G(6) Q noun . (informal)               16 (12%) 
 4. G(6) Q direct . (formal)                8 ( 6%) 
 5. G(6) Q ka . (informal)                  6 ( 4%) 
 6. G(6) Q wake . (informal)                6 ( 4%) 
 7. G(6) Q ka . (formal)                    4 ( 3%) 
 8. G(6) Q n desu ka . (formal)             2 ( 1%) 
 9. G(2) D yo ne . (formal)                 2 ( 1%) 
10. G(6) Q wake desu ka. (formal)           1 ( 0%) 
12 others                                  10 ( 7%) 
total                                     131 

Casual question forms such as -no? and -wake? are dominant in 
informal discourse along with direct-form endings and single-noun 
utterances with a rising tone. Direct endings with a rising tone and 
single-noun endings with a rising tone are frequently used in family 
discourse, suggesting the casualness of the forms. Representatives for 
this category will be "(N)(desu) ka.", "no.", "direct-ending with a rising 
tone", and "noun-ending with a rising tone". 

(E) INFORMATION THAT THE SPEAKER ASSUMES TO BE IN 
THE HEARER'S TERRITORY, and that the speaker has 
some knowledge about it. 
The following utterance was said when speaker (F22) was 
watching "shadow pictures" with the artist who created them, listening 
to the artist's explanation and asking him questions. Although the topic 
was already shared as the speaker's direct experience by viewing them 
directly, still the artist himself had the primary access to the 
proposition concerning the production process. Therefore, statements 
from F22 about the products are an (E) type proposition: 

(5-33) 

F22: konohen no usui inu nanka ironna 
     this area POSS thin dog QUOT various 

     iro o kasanete irassharu wake deshoo . 
     color OBJ make layers do(HON) aren't they 

F22: The pale colors of this area are produced by making layers of 
     various colors, aren't they? 

In the following example, the speaker F5 is talking about a civil 
servant's post-retirement life with M5, who is a civil servant. F5's first 
utterance (1) is a question, and (3) is a simple inference from M5's 
answer to line (1). Since the entire topic is in M5's information 
territory, statement (3) is also an example of (E) type propositions: 

(5-34) 
F5 (1): ano koomuin wa teinen wa rokujuu 
        Well, civil servant CONT retirement age TOP 60 years old 
        desu ka . 
        COP(FOR) Q 

M5 (2): soo 
        so 

F5 (3): ja rokujuu ijoo wa i-rare-nai-n desu yo ne . 
        then, 60 over CONT stay-POT-NEG-n-COP(FOR) VOC RAPP 

M5 (4): soo. 
        so 

F5 (1): Is civil servant's retirement age 60? 

M5 (2): It is so. 

F5 (3): Then, you cannot stay in the office after 60, am I right? 

M5 (4): It is so. 

As with type (B) propositions, the volume of data for this 
proposition type is fairly limited (a total of 349 utterances). Since the 
hearer has the primary access to the target information in (E) type 
propositions, intuitively expected forms for this proposition type were 
some kind of questioning forms (Group 4-DQ and Group 6-Q) or indirect 
forms (e.g. Group 8 - hearsay). 

Quantitative data supported this expectation as shown in the 
following [5-35], although there seemed to be a wider range of choice. 
The proportion of syntactically indirect forms, Group (6) to Group (10), 
was between 20% and 45% across different types of discourse, but a 
considerably large proportion of direct forms with questioning 
nuances (Group 4-DQ evidentials), such as tag-questions 
(daroo., janai.) and the "confirming-ne.", makes the total preference for 
indirectness very high for this proposition type: 

[5-35] Occurrences of sentence-ending forms by groups for type (E) 
propositions, for formal conversation discourse, informal friend 
discourse, family discourse, and combined discourse types 

ENDING-FORMS                      FORMAL       FRIEND      FAMILY      ALL TYPES 
G1  (direct)                        7 ( 4%)     4 ( 7%)     4 ( 9%)     30 ( 8%) 
G2  (D "rapport" -ne)               2 ( 1%)     1 ( 1%)     2 ( 4%)      7 ( 2%) 
G3  (SD tag question.)              3 ( 1%)     3 ( 5%)     4 ( 9%)     13 ( 4%) 
G4  (DQ direct but questioning)    69 (45%)    26 (51%)    19 (43%)    139 (40%) 
G5  (SD "sharing" ne#)              8 ( 5%)     0 ( 0%)     5 (11%)     16 ( 4%) 
G6  (Question forms)               29 (19%)    15 (29%)     9 (20%)     76 (20%) 
G7  (ID inference)                  1 ( 0%)     0 ( 0%)     0 ( 0%)      1 ( 0%) 
G8  (ID hearsay)                   24 (15%)     0 ( 0%)     0 ( 0%)     30 ( 8%) 
G9  (Auxiliary)                     8 ( 5%)     1 ( 1%)     1 ( 2%)     13 ( 3%) 
G10 (ID I think)                    0 ( 0%)     1 ( 1%)     0 ( 0%)     24 ( 6%) 
total                             151          51          44          349 
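
The range quoted for syntactically indirect forms, Group (6) to 
Group (10), can be checked against the counts in [5-35]. The following 
Python sketch is only an illustration; the names COUNTS_E, INDIRECT and 
share are hypothetical, and the totals are computed from the group 
counts rather than read from the chart. 

```python
# Counts from chart [5-35]; columns are FORMAL, FRIEND, FAMILY, ALL TYPES.
COUNTS_E = {
    1: (7, 4, 4, 30),
    2: (2, 1, 2, 7),
    3: (3, 3, 4, 13),
    4: (69, 26, 19, 139),
    5: (8, 0, 5, 16),
    6: (29, 15, 9, 76),
    7: (1, 0, 0, 1),
    8: (24, 0, 0, 30),
    9: (8, 1, 1, 13),
    10: (0, 1, 0, 24),
}

INDIRECT = (6, 7, 8, 9, 10)  # syntactically indirect groups, per the text


def share(groups, column):
    """Percentage of a column's utterances that fall in the given groups."""
    total = sum(row[column] for row in COUNTS_E.values())
    part = sum(COUNTS_E[g][column] for g in groups)
    return 100 * part / total


# Indirect share per discourse type: FORMAL, FRIEND, FAMILY, ALL TYPES.
indirect_shares = [share(INDIRECT, col) for col in range(4)]

# Adding Group (4) DQ forms gives the broader "indirectness" the text
# describes as very high for this proposition type.
with_dq_shares = [share(INDIRECT + (4,), col) for col in range(4)]
```

On these counts the indirect share is roughly 41% (formal), 33% 
(friend), 23% (family) and 41% (all types), and it rises to roughly 
87%, 84%, 66% and 81% once Group (4) forms are included. The FAMILY 
column counts sum to 44. 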

[5-36] Top ten sentence-ending forms for type (E) propositions, formal 
conversational discourse 

SENTENCE-ENDING FORMS                            OCCURRENCE 
 1. G(4) DQ daroo . (tag-Q, formal)              15 ( 9%) 
 2. G(4) DQ yo ne . (confirm, formal)            13 ( 8%) 
 3. G(4) DQ ne . (confirm, formal)               12 ( 7%) 
 4. G(8) ID n da tte yo (hearsay, formal)        10 ( 6%) 
 5. G(4) DQ n da ne . (confirm, formal)           9 ( 5%) 
 6. G(4) DQ n daroo . (tag-Q, formal)             7 ( 4%) 
 7. G(6) Q daroo ka . (formal)                    7 ( 4%) 
 8. G(4) DQ n da yo ne . (confirm, formal)        6 ( 3%) 
 9. G(6) Q n desu ka . (formal)                   6 ( 3%) 
10. G(6) Q ka . (formal)                          5 ( 3%) 
38 others                                        61 (40%) 
total                                           151 


[5-37] Top ten sentence-ending forms for proposition (E), informal 
friend discourse 

SENTENCE-ENDING FORMS                            OCCURRENCE 
 1. G(6) Q -kke . (informal)                      6 ( 9%) 
 2. G(4) DQ janai . (tag-Q, informal)             4 ( 7%) 
 3. G(4) DQ n daroo . (tag-Q, informal)           3 ( 5%) 
 4. G(4) DQ ne . (confirm, formal)                3 ( 5%) 
 5. G(6) Q n ja nai no . (NEG Q, informal)        3 ( 5%) 
 6. G(4) DQ n ja nai . (tag-Q, informal)          2 ( 3%) 
 7. G(6) Q n desu ka . (informal)                 2 ( 3%) 
 8. G(1) D direct (informal)                      2 ( 3%) 
 9. G(1) D kara (informal)                        2 ( 3%) 
21 others                                        25 (49%) 
total                                            51 

Since the proportion of other forms which are not listed is large 
in the above two figures, we cannot really draw a conclusion here. At 
least, it is observable that question-type endings (Q and DQ) are the most 
preferred in both discourse types, and indirect hearsay forms are used 
only in formal discourse, suggesting that speakers do not consider the 
hearer's information as secondhand information when the speaker has 
some information about the proposition, except in highly formal settings. 
Also in informal discourse, the appearances of negative suffixes with 
rising tones such as -n janai., janai., janai no. and question forms for 
confirmation such as -kke? (Was it such and such?) suggest that the 
speaker may be emphasizing the existence of knowledge on his side in 
informal speech. 

The combined data from all discourse types show that 
"confirmation -ne." (including -yo ne. and -n da ne.), which was also 
popular to express completely-shared information (i.e. proposition type 
C), was used for 13% of the instances of (E) type propositions. Genuine 
questions with question particle -ka, which were used for the hearer's 
information (D type propositions), and the indirect form "I think" also 
appeared in (E) type proposition sentences, treating the hearer's 
information territory as 'distant' information. In addition, considering 
that 64% of all utterances belonging to proposition (E) are not in the top 
ten list, it should be concluded, again, that the speakers' choice of 
distinctive ending-forms cannot be generalized for this proposition 
type. At least, the simple summation of the occurrence of ending-forms 
by groups shows that Group (4) (DQ) and Group (6) (Q) type ending forms 
are the most preferred sentence-ending types for (E)-type propositions. 
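
The representative ending-form groups named so far for proposition 
types (A) to (E) can be collected in one place. This is a minimal 
Python sketch of that summary, with hypothetical names 
(PREFERRED_GROUPS, types_preferring); it records only the groups that 
the discussion above singles out as most preferred, and it leaves out 
the formality-dependent alternatives and type (F). 

```python
# Most preferred ending-form groups per proposition type, as stated in
# the running text. Type (D) is represented here by Group (6) because
# its representative forms (ka., no., rising direct and noun endings)
# are listed as Group (6) question forms in the charts.
PREFERRED_GROUPS = {
    "A": (1, 2),  # direct endings and rapport -ne
    "B": (3,),    # falling tag-questions (janai., daroo.)
    "C": (5, 4),  # "sharing" ne# and rising tag-questions
    "D": (6,),    # question forms
    "E": (4, 6),  # DQ and Q forms
}


def types_preferring(group):
    """Proposition types whose preferred groups include this group."""
    return [t for t, groups in PREFERRED_GROUPS.items() if group in groups]
```

A reverse lookup such as types_preferring(4) shows that Group (4) DQ 
forms are shared between types (C) and (E), which matches the 
observation that "confirmation -ne." appears with both. 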

(F) INFORMATION OUT OF BOTH THE SPEAKER'S AND HEARER'S 
TERRITORIES 
Representatives of this type of information are information that a 
speaker obtained from other people's talk or writing, and information 
inferred from observable evidence or from reasoning based on logic, 
intuition, previous experience, and other mental constructs (Willett, 
1988). Although examples of such use may not be necessary, I will show 
a sample below: 

In (5-38) speaker M1 was talking about a famous book written 
by a Japanese author. The author currently lives in Princeton as a 
visiting professor and wrote about his experience in dealing with VIP 
Japanese officials studying in the university. So M1's speech is 
basically hearsay with his commentary that is inference. 

(5-38) 

M1 (1): yohodo amerika e kite kara ne ., 
        very much America DIR came after PART(RAPP) 

   (2): sono jibun no nanteiukana eriito-ishiki ni aa 
        well, oneself POSS what-to-say elite-consciousness LOC well 

        kageri ga detekita-tte iu ka na. 
        shadow NOM showed-QUOTE wonder PART(RAPP) 

   (3): sono eriito-ishiki o hakki dekiru yoona bamen ga 
        that elite-consciousness OBJ display able such scene NOM 

        nai kara usseki-shiteru-n-daroo na. 
        NEG because frustrated-n-probably PART(RAPP) 

   (4): dakara, tamani au nihonjin o tukamaete 
        therefore once in a while meet Japanese people OBJ grab(te) 

        kaikoo ichiban watashi wa nihon-ja jitsuwa 
        opening mouth at first I TOP Japan LOC in fact 

        hensa-chi ikura ikura to iu sugu hensachi o 
        deviation-rate* such and such QUOT soon deviation-rate OBJ 

        mochidasu-n-datte. 
        bring forth-n -hearsay 

        *An individual's deviation score is simply the difference between 
        their actual score and the average score in the nation-wide 
        university entrance exam (similar to the SAT or ACT in the 
        United States), and is one of the most important factors in being 
        accepted at a university. 

   (5): dakara Murakami wa ne, souiu sono 
        therefore, (author's name) TOP RAPP such that 

        kanryoo no eriito ne 
        official MODI elite PART(RAPP) 

   (6): purinsuton atari ni ryuugaku-shite-kuru, 
        Princeton around LOC study-overseas-do-come 

        maa aru imi dewa eriito chuu no eriito kamoshirenai 
        say, in a sense elite among MODI elite might be 

F5 (7): aa soo deshoo ne.. 
        well, so probably PART (RAPP) 

M1 (8): soo-iu renchuu no koto o ne. 
        so-QUOT people POSS matter OBJ PART(RAPP) 

        junsui-baiyoo-gata-hensachi-ningen-tte 
        pure-cultured-style-deviation rate-people-QUOT 

        itten-da yo na.. 
        said-COP VOC RAPP 

F5 (9): Murakami Haruki-tte omoshiroi hito mitai desu ne.. 
        Murakami Haruki-QUOT interesting person seems COP(FOR) CONF 

  (10): tabun, soo desu yo ne.. 
        probably so COP(FOR) VOC RAPP 


M1 (1): To a great extent, ever since they came to America, 

   (2): Well, their own "elite-conscious ego", well what shall I 
        say, shall I say like their elite-consciousness got 
        depressed. 

   (3): Probably their frustration may get pent-up since they 
        have no place to show-off their elite-conscious-ego. 

   (4): So it is said that they readily bring up their "deviation 
        value" [at the time of university entrance exam] when 
        they have occasional chances to meet Japanese people. 

   (5): So the author Murakami calls those government 
        officials like that. 

   (6): Those who come to America, such as to Princeton to 
        study, who might be, in a sense, the "elitist" elite. 

F5 (7): Well, they may be so. 

M1 (8): Murakami calls those people "purely-cultured 
        deviation-value-form-human beings". 

F5 (9): Murakami appeared to be an interesting person. 
  (10): Probably he is so. 

Speaker M1, from my point of view, has a preference for direct-
style speech. The preference may be due to his background; he belongs
to an older generation (70s) and held managerial positions in the
trading business for most of his life. The informant's personal data
show that he preferred to use direct forms for proposition types (B) and
(C), both of which are shared information. However, in talking
about the above topic (i.e., Japanese officials in Princeton), he kept
some distance between the information and himself, probably because
he was very conscious that the episode came from a
book. The clear hearsay evidential -n da tte (I heard) is used in line (4); the
indirect evidential -tte iu ka (I wonder if I should say) appears in line (2), and
mitai (appeared to be) in line (9); the auxiliaries of conjecture -n daroo
na (probably) appear in line (3),
-kamoshirenai (might be) in line (6), and deshoo ne (probably) in line (7).
So the example fully represents an (F) type conversation.
The difference between an (F) type proposition and the previous
(E) type proposition is that an (F) type proposition does not fall in the
hearer's information territory; the common condition between the
two types is that the speaker has some out-of-territory knowledge about
the proposition. Therefore, theoretically, from the speaker's point of
view, (E) and (F) type propositions are not different in terms of
"evidence" on his side. But I had expected some, possibly minor,
differences between the two in terms of evidentiality coding, since
(F) type information is more "distant" from both conversationalists. As
Kamio assumed in his theory, I had anticipated the appearance of a
large number of indirect evidential forms, i.e., inference (Group 7
endings) and hearsay (Group 8 endings), which were not very common
for (E) type propositions. The result meets the expectation in
that indirect forms, as well as auxiliary endings, were preferred for this
proposition type. However, the most preferred form as a whole was,
unexpectedly, the Group (1) type direct-ending, as the following chart
indicates:
[5-39] Occurrences of ending-forms by group, for type (F) propositions,
from all discourse types, formal conversation discourse,
informal friend discourse, and family discourse

SENTENCE ENDING FORMS             OCCURRENCE
                                  All types    Formal      Friend       Family
G1 (direct)                       263 (28%)    13 ( 7%)    102 (29%)    89 (40%)
G2 (D rapport -ne)                 68 ( 7%)    20 (11%)     32 ( 9%)    12 ( 5%)
G3 (SD tag question)               14 ( 1%)     1 ( 1%)      6 ( 1%)     4 ( 1%)
G4 (DQ direct but questioning)     96 (10%)    10 ( 5%)     43 (11%)    32 (14%)
G5 (SD sharing ne#)                11 ( 1%)     3 ( 1%)      3 ( 1%)     5 ( 2%)
G6 (Question)                      26 ( 2%)     7 ( 4%)      7 ( 2%)     3 ( 1%)
G7 (ID inference)                  89 ( 9%)    28 (16%)     32 ( 9%)    11 ( 5%)
G8 (ID hearsay)                   183 (19%)    24 (14%)     74 (21%)    36 (16%)
G9 (Auxiliary)                     85 ( 9%)    32 (18%)     20 ( 5%)    17 ( 7%)
G10 (ID I think)                   96 (10%)    31 (18%)     25 ( 7%)     9 ( 4%)
Total                             931          169          344         218

In formal discourse, although assertive direct endings accounted
for 18% (G1 + G2) of the data, the informants preferred Group
(7) to (10) endings, which are indirect. The overall occurrence of indirect
forms in formal discourse was 72% (the sum of Groups 6 to 10) for (F)
type propositions. Therefore, it is reasonable to conclude that speakers
are sufficiently indirect in formal settings when talking about other
people's matters. On the other hand, in informal discourse among friends
and family members, Group (1) type direct-endings occurred most often,
although the sum of indirect forms that occurred in informal speech was large
enough (e.g., 55% for informal friend discourse) to posit indirect forms
as "recommended" forms for this proposition type.
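
The 72% figure for formal discourse can be reproduced from the Formal column of [5-39]. The following is a minimal Python sketch and not part of the original analysis; the names FORMAL_F, INDIRECT_GROUPS, and share_percent are hypothetical, and the counts are simply transcribed from the table.

```python
# Counts transcribed from the Formal column of table [5-39]
# (type (F) propositions, formal conversation discourse).
FORMAL_F = {
    "G1": 13, "G2": 20, "G3": 1, "G4": 10, "G5": 3,
    "G6": 7, "G7": 28, "G8": 24, "G9": 32, "G10": 31,
}

# Groups that the text sums as "indirect" (Group 6 to Group 10).
INDIRECT_GROUPS = ("G6", "G7", "G8", "G9", "G10")


def share_percent(counts, groups):
    """Return the percentage of all occurrences that fall in `groups`."""
    total = sum(counts.values())
    part = sum(counts[g] for g in groups)
    return 100.0 * part / total


if __name__ == "__main__":
    print(sum(FORMAL_F.values()))                           # 169
    print(round(share_percent(FORMAL_F, INDIRECT_GROUPS)))  # 72
```

Because the table prints whole-number percentages, a share recomputed for a single cell may differ from the printed value by a point.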

The preference for direct forms in informal discourse for (F)
propositions can be understood from the "politeness" point of view.
Unlike the case of (E) type propositions, which are owned by the
conversational partner, the speaker has no immediate need to be
polite to the owner of (F) type information, who is not present at the
speech site. This may encourage extension of the speaker's information
territory and, accordingly, the use of direct evidentials. As a matter of
fact, the formal discourse data suggest that higher formality requires a
speaker to attend carefully to the use of direct forms, which are
supposed to be reserved for propositions from the speaker's information
territory. However, at the same time, the speculation that Japanese
people always treat other people's information as indirect information
(cf. chapter three) turned out to be false, in a sense, in this research
(also see later section for further discussion).

In the following, the most popular sentence-ending-forms for
proposition (F) are listed for formal conversation discourse and
informal friend discourse:

[5-40] Top ten sentence-ending forms for type (F) propositions, formal
conversational discourse

     SENTENCE-ENDING FORMS                                      OCCURRENCE
 1.  G(10) ID omou ("I think"-formal)                             8 ( 4%)
 2.  G(10) ID omou n da kedo ("I think"-formal)                   8 ( 4%)
 3.  G(2) D n da ne. (formal)                                     5 ( 2%)
 4.  G(9) AUX conjecture daroo ne. ("probably"-formal)            5 ( 2%)
 5.  G(9) AUX conjecture daroo. ("probably"-formal)               5 ( 2%)
 6.  G(7) ID mitai da kedo ("It appears"-formal)                  5 ( 2%)
 7.  G(8) ID -to kiita kedo ("I heard"-formal)                    4 ( 2%)
 8.  G(8) ID n da tte ("It is said"-formal)                       4 ( 2%)
 9.  G(9) AUX conjecture daroo ne# ("probably"-formal)            4 ( 2%)
10.  G(2) D -ne. (formal)                                         4 ( 2%)
     81 others                                                  117 (70%)
     total                                                      169

[5-41] Top ten sentence-ending forms for type (F) propositions,
for informal friend discourse

     SENTENCE-ENDING FORMS                                      OCCURRENCE
 1.  G(1) D direct (informal)                                    30 ( 8%)
 2.  G(8) ID n da tte ("I heard"-informal)                       24 ( 6%)
 3.  G(4) DQ ja nai. ("Isn't it?"-informal)                      18 ( 5%)
 4.  G(8) ID n datte ("I heard"-informal)                        17 ( 4%)
 5.  G(1) D no yo (informal)                                     15 ( 4%)
 6.  G(1) D no (informal)                                        14 ( 4%)
 7.  G(8) ID n datte ("I heard"-formal)                          12 ( 3%)
 8.  G(1) D yo (informal)                                        11 ( 2%)
 9.  G(7) ID rashii no ne. ("It appears"-informal)               10 ( 2%)
10.  G(1) D noun (informal)                                       9 ( 2%)
     88 others                                                  184 (53%)
     total                                                      344

Again, the proportion of other forms that are not listed is large in
the above two tables: the forms chosen are spread widely across
Group (6) to Group (10) for formal discourse, and across all the groups
except the question group for informal discourse. Therefore, in the
combined data, no particular ending form was used in more than 10%
of cases.

Due to differences in preference across discourse types, it is
difficult to choose representative forms for type (F) propositions for the
model, but it is probably reasonable to assume Group (7) inference
forms and Group (8) hearsay forms as the standard forms for all discourse
types; for informal speech situations, direct forms such as a simple
direct-ending, noun-endings, and vocative -no and -yo endings should be
added. Group (10), "I think"-type endings, should also be added to the
group of formal speech for proposition (F).

So far, I have explained the differences among the six types of
information in relation to the concepts of "having information in
territory" and "knowing information". In addition to the basic six types
of information, two additional types were assumed for experimental
purposes in the process of data analysis for this study. These
additional categories are: (G) public information, and (H) self-talk
(talking to oneself).
First, category (G), the public information category, was
originally included in category (F), information outside of both the
speaker's and the hearer's territories, since public information is usually
reported information. My earlier studies found that Japanese speakers
consistently treated public information as other people's information (i.e.,
outside of their information territory) linguistically, with indirect
hearsay forms. However, this time, during the course of data collection,
it was noted that quite a high proportion of the informants occasionally
used direct expressions in describing publicly well-known
information. I speculated that this was due to my selection of topics
(when available) which were highly digested among the community
members. If public information is understood as belonging to the
speaker's territory, the definition of information within the speaker's
territory would need to be modified to include mere "knowledge".
Therefore, in order to locate the position of
public information within the evidentiality framework, this category
was separated.

As expected during the data collection, Group (1) direct-endings
were preferred over indirect hearsay endings for (G) type
propositions.

[5-42] Occurrence of sentence-ending forms by groups for type (G)
propositions, for formal conversation discourse, informal friend
discourse, family discourse, and combined discourse types

ENDING-FORMS                      FORMAL      FRIEND       FAMILY      ALL TYPES
G1 (direct)                       5 (17%)     100 (35%)    56 (50%)    168 (38%)
G2 (D rapport -ne)                4 (14%)      18 ( 6%)     4 ( 3%)     26 ( 5%)
G3 (SD tag question)              0 ( 0%)      17 ( 5%)     3 ( 2%)     20 ( 4%)
G4 (DQ direct but questioning)    6 (21%)      40 (14%)    22 (19%)     68 (15%)
G5 (SD sharing ne#)               1 ( 3%)       0 ( 0%)     3 ( 2%)      4 ( 0%)
G6 (Question forms)               3 (10%)      25 ( 8%)     8 ( 7%)     36 ( 8%)
G7 (ID inference)                 6 (21%)      16 ( 5%)     1 ( 0%)     23 ( 5%)
G8 (ID hearsay)                   1 ( 3%)      62 (21%)    11 ( 9%)     75 (17%)
G9 (Auxiliary)                    0 ( 0%)       1 ( 0%)     1 ( 0%)      2 ( 0%)
G10 (ID I think)                  2 ( 7%)       6 ( 2%)     2 ( 1%)     15 ( 3%)
total                            28           285         111          437


The total number of occurrences of (G) type propositions was too
limited for formal discourse to positively identify trends. Data for
informal discourses, both friend and family, show that informants
preferred Group (1), Group (4), and Group (8) forms for type (G)
propositions. This result also holds for the combined data of all
discourse types. The results clearly indicate that for a large proportion
of the informants, very well-known public information belongs to
everybody's information territory, and therefore G(4) "seeking-agreement"
endings were also preferred. For some speakers, this
information is socially accepted truth, so they used Group (1) direct
evidentials.

Theoretically legitimate evidentials for type (G) propositions may
be Group (7) (inference) and Group (8) (hearsay) type indirect forms,
and they were actually frequently used. However, at the same time, for
some speakers, whether or not the information is shared by the hearer
is more important than who originally owned the information. This
view also explains the high frequency of Group (4) type ending-forms
(i.e., shared-information evidentials). The preference for Group (1), (4),
and (8) ending forms for (G) propositions (public information) is very
similar to the result for (F) type propositions (information outside of
both speaker's and hearer's territories), suggesting that speakers may
perceive the two categories as being almost identical. The observed
high correlation between the two can be explained as follows: public
information (G) is, theoretically, a subset of other people's information
(F). The only difference is that (G) propositions are known to all as
truth, while (F) propositions are not necessarily known to all. However,
in actuality, conversationalists tend to speak about (F) propositions
which are known to their partners. Thus we see only a marginal difference
between the results of the analysis on (G) and (F) type propositions.

Information type (H), the speaker's talk to himself, is actually
beyond the scope of this research, since the main concern of the study is
the epistemic modality of the proposition, by which the speaker expresses
his degree of commitment to the truth value of his proposition in the
presence of the hearer. Mackey (1968) considered language to have
two distinctive functions: external functions and internal functions.
Regarding internal functions, he assumed language is used for
counting, reckoning, cursing, dreaming, diary writing, and note
taking. Mackey's view is different from the well-known classifications
of language functions proposed by Jakobson (1960) or Hymes (1968) in
that Mackey paid attention only to whether or not language use is aimed
at communicating with someone other than the speaker.

A speaker makes frequent self-talk-style utterances in 
conversation with hearers; often these utterances obviously are not 
directed to the hearer but the speaker lets the hearer hear the 
utterances. See the following example: 

(5-43) 

F18 (1): sensei mo ningen da kara suki kirai -tte
teacher also human COP because favorite-QUOT

aru deshoo# to iu ka, hachoo no au seito
have don't they?(SHAR) or chemical harmonious student

to hachoo no awanai seito -tte aru deshoo.
and chemical disharmonious student-QUOT exist don't they?

dakara soo iu tokoro ni ittan hairikon-dara
therefore such place LOC once enter-COND

moo unn ga warui-tte koto ni naru deshoo.
already luck NOM bad-QUOT COMP DAT become isn't it?

F5 (2): soo desu ne#
soo COP SHAR

(3): taihendaa..... gakkoo -tte
rough school COMP

F18 (4): unn ..
yes

F5 (5): sorede doo nasatta-n-desu ka?
then how did(HON)-n-COP(FOR)-Q

(6): gakkoo kaenakatta-n-desu ka?
school transferred-n-COP(FOR)-Q

F18: (shook head)

F5 (7): ja, ganbatta-n-da...
then hanged on-n-COP

F18 (8): demo are wa yappari shippai deshita ne E.
but that TOP as expected mistake COP(FOR) CONF

F18 (1): As teachers are human beings, they have their favorite
students and non-favorites, don't they? Or I should say,
there are students who get along with the teachers, and
others who do not, aren't there? Therefore, once you get
stuck in that kind of place, you are just unlucky, aren't
you..

F5 (2): It is so, isn't it.

(3): School is difficult (plain VOC form)...

F18 (4): Yes...

F5 (5): Then, what happened?

(6): Did he change schools? [F18 shook head]

(7): Then he hung on (plain form)...

F18 (8): But that was after all a wrong decision.

This conversation is a formal one between two female speakers with
an age difference. There is no substantial power difference between
the speakers except for the age factor, and they are fairly close, as they
have had a good relationship for a period of ten years. Nevertheless, the
language form was formal. Speaker F5 (the younger one) occasionally showed
the intimacy she felt toward speaker F18. In doing so, speaker F5 used
plain form utterances, but they were not, on the surface, directed to
speaker F18; speaker F5 formed them in a self-talk style, as in utterances
(3) and (7).

In this way, a speaker may use self-talk type speech as a
discourse strategy. Ikuta (1983) viewed speech level shifts from polite
forms (e.g. desu, masu) to plain forms (e.g. da) as often being used to
signal the flow of empathy between speakers, since polite endings are
primarily an expression of the social or attitudinal "distance" which the
speaker perceives between himself and his addressee. Ikuta further argued
that plain forms are also used to organize the discourse effectively in
expressing "illustrative instances" within a discourse. She argued that
plain form use in basically-formal conversation produces a different
social "space" within the same discourse. Although all conversational
utterances were regarded as having been targeted to the addressees in
Ikuta's research, the function of plain form use in formal speech is
regarded as "strategic", as my analysis of sentence ending forms for (H)
type propositions suggests. Since self-talk type utterances are supposed
to be made without taking a hearer into consideration, their modality is
naturally direct. Category (H) was added to track speech behavior
without a hearer, whose presence was speculated to be crucial in
the Japanese concept of evidentiality.

The total number of occurrences of (H) type propositions in the
entire discourse data was small (i.e., 164), but the informants'
preference for direct forms for this proposition type was clearly
observable. Since (H) type utterances should be considered informal, no
(H) type utterances were found in discourses with restricted speech
situations, such as public talk, school, and courtroom. There seems to be
no substantial difference in evidential forms across the types of
discourse in which (H) type utterances occurred:

[5-44] Occurrences of sentence-ending forms by groups for type (H)
propositions, for formal conversation discourse, informal friend
discourse, family discourse, and combined discourse types

ENDING-FORM                       FORMAL      FRIEND      FAMILY      ALL TYPES
G1 (direct)                       54 (73%)    13 (24%)     8 (25%)    78 (47%)
G2 (direct + rapport -ne)          4 ( 5%)     5 ( 9%)    13 (41%)    23 (14%)
G3 (SD tag question)               0 ( 0%)     0 ( 0%)     0 ( 0%)     0 ( 0%)
G4 (DQ direct but question-type)   0 ( 0%)     0 ( 0%)     0 ( 0%)     0 ( 0%)
G5 (SD sharing ne#)                0 ( 0%)     0 ( 0%)     0 ( 0%)     0 ( 0%)
G6 (Question forms)               11 (15%)    30 (55%)    10 (32%)    50 (32%)
G7 (ID inference)                  0 ( 0%)     0 ( 0%)     0 ( 0%)     1 ( 0%)
G8 (ID hearsay)                    1 ( 1%)     0 ( 0%)     0 ( 0%)     1 ( 0%)
G9 (Auxiliary)                     1 ( 1%)     4 ( 7%)     0 ( 0%)     5 ( 3%)
G10 (ID I think)                   2 ( 2%)     2 ( 3%)     0 ( 0%)     4 ( 2%)
total                             73          54          31         164

Group (1), Group (2), and Group (6) type ending forms were
preferred across discourse types for (H) type propositions. Naturally,
the forms were all informal. Among Group (1) and Group (2) type direct-
endings, "simple-direct forms", "noun-endings", and "vocative na or
naa endings" were used, and the "n-da cluster" and "rapport -ne" were
also used, as if a speaker were explaining his own utterance to himself.
Among Group (6) question forms, descending -ka. forms and -ka + na.
forms were used in such a way that the speaker asks himself a question.

THE MODEL 

Based on the data analysis and the discussion, I propose the 
framework of Japanese sentence-ending evidentials in [5-45] in 
relation to the types of propositions. The framework will function with 
the proposed corollaries. 


In short, the model indicates the commonly-preferred pattern of
evidential codings among Japanese speakers at the sentence ending;
thus, there is a wider variety of codings which do not strictly conform
to this model yet are pragmatically acceptable. In constructing the
model, I did not assume an ideal perfect speaker, but simply tried to
capture a generalizable pattern of actual evidential usage from the
perspective of speech situation and propositional context.

The basic nature of this evidential framework is "speaker-orientation"
and "hearer-sensitivity". The model is speaker-oriented
because all information which falls in the speaker's territory, solely or
shared with the hearer, is considered to be direct evidence, since the
speaker is socially entitled to have primary access to the information.
This view matches the "mental-space" view in that all of the speaker's
direct information is supposed to be stored in the speaker's direct
memory area, or the speaker can presumably access this type of
information most directly. Therefore, when he utters a sentence, the
speaker first needs to determine whether or not the proposition is
within the reach of his primary access, that is, whether or not the
information is in his direct memory. If it is, the speaker uses direct
evidentials; if not, some kind of indirect evidential form is preferred.

Regarding propositions which fall in the speaker's information
territory, in actual speech to specific hearers, three different types of
propositions were assumed, i.e., (A), (B), and (C) in [5-45]. The
difference among these three types concerns the hearer's knowledge or
information territory. The information of both type (B) and
type (C) propositions is shared with the hearer, but the distinction
between the two types of propositions in the speaker's psychology is
empirically supported by the sentence-ending evidential forms used by
informants across a variety of speech situations. Therefore, I would like
to argue that the model shows a generalized "pattern of preference" in
choosing sentence-ending forms among Japanese speakers regarding
(A), (B), and (C) type propositions. Accordingly, if an individual does
not differentiate among (A), (B), and (C) proposition types by sentence-
ending evidential forms to the degree that the speech situation
requires, his language behavior can be problematic, i.e., he might be
considered offensive to his hearer.
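
The territory distinctions behind proposition types (A) to (F) can be summarized as a small decision procedure. The Python sketch below is only an illustration under stated assumptions: the class Proposition, its field names, and the function proposition_type are hypothetical, and the endings listed are the preferences reported in this chapter's tables and discussion, which are tendencies rather than categorical rules. Types (G) and (H) are left out because they depend on the public status of the information and on the absence of an addressee rather than on territory alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposition:
    # Hypothetical feature names; the chapter states these notions in prose.
    in_speaker_territory: bool
    in_hearer_territory: bool
    hearer_has_knowledge: bool = False   # consulted only for speaker-only territory
    speaker_has_knowledge: bool = False  # consulted only for hearer-only territory


def proposition_type(p: Proposition) -> str:
    """Map territory and knowledge features to proposition types (A) to (F)."""
    if p.in_speaker_territory and p.in_hearer_territory:
        return "C"
    if p.in_speaker_territory:
        return "B" if p.hearer_has_knowledge else "A"
    if p.in_hearer_territory:
        return "E" if p.speaker_has_knowledge else "D"
    return "F"


# Preferred ending groups as reported in the chapter (tendencies, not rules).
PREFERRED_ENDINGS = {
    "A": "G1 (direct)",
    "B": "G2 to G5 (hearer-sensitive direct forms)",
    "C": "G2 to G5 (hearer-sensitive direct forms)",
    "D": "G6 (syntactic question)",
    "E": "G4 (direct form with questioning intonation)",
    "F": "G7 (inference) or G8 (hearsay)",
}

if __name__ == "__main__":
    # Information owned by the hearer that the speaker knows by hearsay.
    example = Proposition(False, True, speaker_has_knowledge=True)
    kind = proposition_type(example)
    print(kind, "->", PREFERRED_ENDINGS[kind])
```

The branching order mirrors the speaker-oriented logic of the model: the speaker's own territory is checked first, and the hearer's knowledge only refines the choice afterwards.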

In the same way, proposition types (D) and (E) are used for
information to which the hearer has primary access, even though the
speaker may have some knowledge about the proposition based on
hearsay or inference (i.e., E type). Speakers prefer to identify each
of these two types of information by different questioning-style
sentence-ending evidentials. Syntactic questioning forms (G6) were
preferred for (D) type propositions, and direct sentences with
questioning intonation (G4) were preferred for (E) type propositions.
I consider these two types of sentence-ending evidential forms to be
"indirect" in effect.

(F) type information is another domain of indirect evidentials, for
obvious reasons. However, the data demonstrated that the opposite type
of evidential forms, direct evidentials, were also preferred for
propositions with (F) type information. Low formality of situation
seems to enhance the occurrence of direct evidentials for this
information type. In using direct-endings, the speakers did not overly
mark, in a sense, the difference between their "own" information and
(F) type information. On the other hand, they showed careful
consideration of the status difference between their "own"
information and the hearer's information in dealing with (C), (D) and
(E) type information. A possible explanation is that some speakers tend
to express (F) type information rather assertively because they are
naturally more concerned about the hearer than about other people who
"own" (F) type information but are not present at the time of the
utterance. The same tendency was seen with (G) type "public
information", for which both the most direct evidentials and
indirect evidentials were among the most popular forms.
The second possible explanation for the speakers' preference for
direct forms may be the high truth value attached to public or other
people's information. (G) type propositions were found to be treated as
"distant" information with indirect hearsay or inference forms in my
earlier study; this time, however, public information was found to be
often spoken with direct forms. This may have been caused by the
choice of topics; the topics which I used for informal discussions were
so well known among the informants as to be considered
in-territory information. Naturally, if certain public information has
been continuously noted, digested, developed, discussed, and analyzed
for a sufficiently long period of time, and if it is very closely related to
people's daily life, the topic may become everybody's own-territory
information to some extent, thus meeting condition (d) of Corollary two,
namely "information which is unchallengeable by the hearer due to its
historically and socially qualified status as 'truth'". When one first
hears that Princess Diana was killed in an accident, hearsay forms may
be used initially to convey the information to others, but after hearing
details of the accident, seeing photos of the accident, and witnessing the
reaction of society, one will stop using hearsay forms. The
information has penetrated into people's minds as fact. In such a way,
(G) type information can be "information which is unchallengeable by
the hearer". The same phenomenon can occur with (F) type information
when it is shared within a group as truth. In this respect, Labov and
Fanshel's distinction between O-Events (events which are publicly
known as truth) and D-Events (events which are considered to be
disputable) may be applicable to Japanese (cf. chapter three) as far as
this research result is concerned.
But it should be noted that there was also a large proportion of
utterances that handled public information as out-of-territory
information. It seems that the discourse type, formal or casual, makes a
difference in the choice of evidentials for (G) type and (F) type
propositions. As a matter of fact, direct evidentials carelessly chosen by
a speaker for a third person's information as a referent can be
offensive to the hearer, since the behavior can be seen as an
overextension of the speaker's information territory. Discussion on this
point will continue in a later section in relation to politeness.

The model also reflects theories of Japanese modality expressions, 
where most evidentials occur. Sentence-ending evidentials have 
interactional functions as discourse markers in that they function to 
inform the hearer of the speaker's purposes in uttering a sentence: 
transferring new information (direct ending), reminding the hearer of
his knowledge (tag-question, etc.), confirming that information is
shared ("sharing -ne", etc.), requesting new information (question), and
so on. Sentential and noun-phrasal modalities (e.g. deixis) work to 
involve the hearer's knowledge in the speaker's utterances by 
connecting the speaker's knowledge with the hearer's knowledge of 
any given proposition, and enhance smooth communication by showing 
the speaker's respect to the hearer's information territory and 
knowledge. 

It should be emphasized again that the rules that this model
proposes are genuinely pragmatic; they are not part of prescriptive
grammar and thus are never explicitly taught at school, and therefore
Japanese speakers seem to learn these rules through interaction with
people. For the same reason, and also because the evidentiality rules
are not well understood, the use of sentence-ending forms is not
systematically explained in Japanese-as-a-foreign-language classes.
On the other hand, as we noted, following
these rules, which represent the general preference of Japanese
speakers, may be crucial to being a good community speaker of the
language. In fact, even social stigmatization may be expected against
offenders. Therefore, this model may be of help in the field of language
teaching in that it shows in an organized way how to end sentences, in
relation to the proposition types, in order to be a competent speaker
of Japanese.

To construct the evidential model framework, so far the data was 
mainly viewed from proposition-types with which evidentials appeared. 
Then, the data was analyzed with other perspectives such as discourse 
types. Results are shown in the following sections. 

DIRECT AND INDIRECT SENTENCE-ENDING EVIDENTIALS 

This study started with the popular observation that Japanese
speech is indirect. Then, the question was asked as to whether the data
support this belief. Data analysis of (F) type propositions indicates that
Japanese utterances may not be as indirect as they have been
considered to be. A statistical analysis that simply added up the
occurrences of direct and indirect sentence-endings shows that Japanese
speakers use more direct sentence-ending evidentials than expected in
all discourse types (i.e., speech situations), as in the following [5-46]:


[5-46] Sentence-ending evidential forms for all discourse types in total

                 Male          Female        Student      Total
Direct forms     1796 (66%)    2111 (52%)    203 (75%)    4110 (58%)
Direct Q          213 ( 7%)     526 (12%)      5 ( 1%)     744 (10%)
Semi Direct        86 ( 4%)     303 ( 7%)      3 ( 1%)     392 ( 5%)
Indirect          329 (12%)     383 ( 9%)     16 ( 5%)     728 (10%)
AUX                71 ( 2%)      79 ( 1%)      2 ( 0%)     152 ( 2%)
Question          200 ( 7%)     657 (16%)     41 (15%)     898 (12%)
Total            2695          4059          270          7024

The results indicate that Japanese sentences frequently end with
direct evidentials. Direct sentence-ending forms were found to be
dominant in all discourse types, as shown below in [5-47].

[5-47] Sentence-ending forms for each discourse type

                                   direct        DQ           Semi-        indirect     AUX         question     total
                                   forms                      direct       forms                    forms
formal conversational discourse    1124 (56%)    236 (11%)    138 ( 6%)    188 ( 9%)    62 ( 3%)    245 (12%)    1993
public discourse                    268 (66%)     34 ( 8%)     27 ( 6%)     41 (10%)     3 ( 0%)     28 ( 6%)     401
informal friend discourse          1061 (55%)    209 (10%)    135 ( 7%)    253 (13%)    33 ( 1%)    213 (11%)    1904
family discourse                    899 (61%)    179 (12%)     70 ( 4%)    100 ( 6%)    26 ( 1%)    188 (12%)    1462
courtroom discourse (prosecutor)    175 (54%)     10 ( 3%)      2 ( 0%)     66 (20%)     9 ( 2%)     62 (19%)     324
courtroom discourse (defendant)     251 (80%)      0            0           55 (17%)     4 ( 1%)      0 ( 0%)     310
school discourse                    332 (52%)     76 (12%)     20 ( 3%)     25 ( 3%)    15 ( 2%)    162 (25%)     630

Since the figures in [5-46] and [5-47] are simple summations of
occurrences of forms without consideration of the propositional content
of the sentences, they do not carry any statistical meaning for a
realistic model of evidentiality. However, these figures do provide an
overview of Japanese indirectness. First, Japanese speakers use direct
speech in approximately half of all speech opportunities. This does not
appear to be "overly indirect"; however, since relevant data from other
languages are not available for comparison, it is an open question
whether or not Japanese speech is significantly more indirect than
some universal norm. Obviously, "semi-direct forms" (SD) and "direct
forms with questioning nuance" (DQ) contribute to the indirect nature
of Japanese speech. Although SD and DQ sentences end with direct
forms of verbs, adjectives and the copula, they still demonstrate the
speaker's sensitivity to the hearer's knowledge via interactional
sentence-final particles, negative forms, rising intonations, and so on.
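
The claim that direct forms dominate in every discourse type can be checked against the counts in [5-47]. The following Python sketch is an illustration only; the name COUNTS_BY_DISCOURSE and the helper direct_share are hypothetical, and the counts reflect one reading of the table (columns in the order direct, DQ, semi-direct, indirect, AUX, question).

```python
# Counts as read from table [5-47]; each tuple is
# (direct, DQ, semi-direct, indirect, AUX, question).
COUNTS_BY_DISCOURSE = {
    "formal conversational": (1124, 236, 138, 188, 62, 245),
    "public": (268, 34, 27, 41, 3, 28),
    "informal friend": (1061, 209, 135, 253, 33, 213),
    "family": (899, 179, 70, 100, 26, 188),
    "courtroom (prosecutor)": (175, 10, 2, 66, 9, 62),
    "courtroom (defendant)": (251, 0, 0, 55, 4, 0),
    "school": (332, 76, 20, 25, 15, 162),
}


def direct_share(row):
    """Percentage of sentence endings in a row that are direct forms."""
    return 100.0 * row[0] / sum(row)


if __name__ == "__main__":
    for name, row in COUNTS_BY_DISCOURSE.items():
        print(f"{name:24s} total={sum(row):5d} direct={direct_share(row):.1f}%")
    grand_total = sum(sum(row) for row in COUNTS_BY_DISCOURSE.values())
    print("grand total:", grand_total)
```

Under this reading, each row sums to the total printed in [5-47], and the grand total agrees with the overall total of 7024 in [5-46].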

This point is illustrated in [5-48] below by the
occurrence of Group (1) ending-forms in each proposition type in
comparison with other direct forms: Group (2) to Group (5). Group (1)
consists of only direct endings and direct plus vocative type suffixes.
Group (2) forms are direct forms with the hearer-conscious rapport -ne.
Group (3) and Group (5) are semi-direct, and Group (4) is DQ. For the
proposition types (B), (C), (D), and (E), direct Group (1) forms were not
preferred, even though the speaker may have knowledge of (B), (C),
and (E) type propositions.

[5-48] Occurrences of G(1) to (5) type direct endings in the proposition 
types (B), (C), (D), and (E) in all discourse types

Proposition type                              G(1) direct    G(2) to (5)
                                              ending         direct ending
(B) In the speaker's inf. territory, the      11%            79%
    hearer may have some knowledge
(C) In both speaker's and hearer's inf.       28%            62%
    territory
(D) In the hearer's inf. territory only        3%             1%
(E) In the hearer's inf. territory and         8%            48%
    speaker has knowledge

On the other hand, as [5-49] shows, Group (1) type sentence-ending 
forms, which indicate the speaker's high commitment to his 
proposition, are dominantly used in describing (A) type propositions 
(i.e., information in the speaker's territory that the hearer does not 
know), and are also found frequently in (G) type propositions (i.e., 
publicly known information) and (H) type propositions (i.e., the 
speaker's self-talk). They also occurred fairly often with (F) type 
propositions (i.e., information outside both the speaker's and the 
hearer's territory), although indirect forms were used more often for 
(F) type propositions than Group (1) direct-ending forms. 

[5-49] Occurrence of Group (1) to Group (5) endings in proposition 
types (A), (F), (H), and (G) 

Proposition type                              G(1) direct    G(2) to G(5)
                                              ending         direct ending
(A) In the speaker's information territory,   72%            20%
    the hearer has no knowledge
(F) Out of both speaker's and hearer's        28%            19%
    information territory
(G) Public information                        38%            24%
(H) The speaker's self-talk                   46%            27%

The popularity of Group (1) type direct-ending forms in (A) and 
(H) proposition-type sentences is understandable for obvious reasons: 
both (A) and (H) type propositions are "speaker-only" information in 
which the hearer's knowledge is not anticipated (for A) or the speaker 
temporarily pretends ignorance of the hearer's existence for a strategic 
purpose (H). However, the high occurrence of G(1) type 
direct-sentence-endings was not expected for (F) and (G) type 
sentences, as discussed earlier. 

Japanese speech, then, can be fairly direct with certain 
propositions. However, if we limit the scope to basic direct-ending 
forms without interactional suffixes, the occurrence of such 
basic forms is fairly rare, although they are the major forms taught and 
used in Japanese-as-a-foreign-language classes until learners become 
fairly proficient in the language. Those forms are generally -desu or 
-masu (and related forms) for formal endings and -da (and related 
forms) for informal endings. These basic forms of direct endings do not 
convey the speaker's interactional concerns, so they may be considered 
too straightforward and simple for use in conversation. For (A) 
type propositions, basic forms of direct endings are used to some extent, 
but for other proposition types, basic direct endings were rarely used 
even for describing the speaker's own information. The informants, 
being aware of the hearer's presence, used Group (1) and (2) type 
sentence-final particles to suffix basic forms of direct endings. The 
next chart [5-50] indicates the occurrence of basic direct 
forms in each propositional type: 

[5-50] Occurrences of basic forms of direct-sentence-endings in each 
proposition type [D direct (formal/informal)] 

Proposition type                          basic forms of formal   basic forms of informal
                                          direct ending           direct ending
                                          (-desu, -masu, etc.)    (-da, etc.)
(A) In the speaker's info. territory,     415 (12%)               533 (16%)
    out of the hearer's knowledge
    (total: 3254)
(B) In the speaker's info. territory,       2 ( 1%)                 1 ( 0%)
    the hearer may have some knowledge
    (total: 171)
(C) In both speaker's and hearer's         59 ( 5%)                61 ( 5%)
    info. territory
    (total: 1039)
(D) In the hearer's info. territory,        1 ( 0%)                 8 ( 1%)
    out of the speaker's knowledge
    (total: 349)
(E) In the hearer's info. territory,        4 ( 1%)                 3 ( 0%)
    the speaker also has some knowledge
    (total: 349)
(F) Out of both parties' info.              6 ( 0%)                93 ( 9%)
    territory
    (total: 437)
(G) Publicly known information              0 ( 0%)                63 (14%)
    (total: 437)
(H) The speaker's self-talk                 0 ( 0%)                61 (37%)
    (total: 164)

As the above table indicates, basic forms of direct endings, both 
formal and informal, are not often used in Japanese discourse except for 
(A) type propositions (i.e., information that belongs to the speaker's 
territory and about which the hearer has no knowledge) and (H) type 
propositions (i.e., the speaker's self-talk). The data support the 
assumption that the use of basic direct-sentence-ending forms is 
seriously limited pragmatically in Japanese discourse. 

Now we turn to indirect forms (or ID forms). ID forms are almost 
exclusively used for the proposition type (F), and also (G) to a lesser 
extent (cf. [5-39] and [5-42]). This result conforms to the universal view 
of evidentiality in that a proposition with indirect evidence results in 
an indirect expression, although the use of indirect sentence-ending 
evidentials is also influenced by the discourse type. 

Although the proportion of indirect sentence-ending forms was 
not as high as had been expected, it seems that individuals sometimes 
make conscious choices between using direct forms and indirect forms. 
When engaging in an informal conversation, a female speaker (F2) 
declared that she was very knowledgeable in a variety of both 
world-wide and domestic gossip topics (she said "Ask me, ask me 
anything!"), and so the amused hearers started to ask questions, and F2 
pretended to be an "all-knowing housewife" and spoke in this fashion 
for half an hour. The following data compare F2's normal speech and her 
"all-knowing housewife" speech for (F) and (G) type propositions (cf. 
Appendix E). In her "pretend" speech, the use of indirect forms 
decreased from 60% in her regular speech to 40%, suggesting that the 
speaker's view of the relationship between herself and the proposition 
affected her choice of evidentiality. 
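
The shift reported here can be rechecked from the counts listed in 
[5-51]. The following is only a minimal sketch: the Python names are 
illustrative assumptions and were not part of the original analysis, 
and rounding to a whole percent is likewise an assumption. 

```python
# Counts copied from table [5-51]; all names here are hypothetical.
F2_COUNTS = {
    "regular": {"total": 144, "indirect": 68, "question": 18},
    "reporter": {"total": 60, "indirect": 20, "question": 4},
}


def non_direct_share(row):
    """Percentage of indirect plus question endings, rounded to a whole percent."""
    return round(100 * (row["indirect"] + row["question"]) / row["total"])


# Share of indirect and question endings in each speech style.
shares = {style: non_direct_share(row) for style, row in F2_COUNTS.items()}

# Drop in percentage points between regular and "reporter" speech.
shift = shares["regular"] - shares["reporter"]

print(shares, shift)
```

Under these assumptions the sketch gives 60% for regular speech and 
40% for "reporter" speech, i.e., a drop of 20 percentage points. 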

[5-51] Evidential-choice shift for speaker F2 

                         Total number of F+G      Number of INDIRECT and QUESTION
                         proposition sentences    (inc. DQ) sentence-ending forms
Regular F2 speech        144                      68 (ID) + 18 (Q)    60% 
"Reporter" F2 speech      60                      20 (ID) +  4 (Q)    40% 
Data from public talk discourse suggest that a reporter who is 
talking to the public tends to treat information he is transferring to 
unspecified hearers as his own information, i.e., as an (A) type 
proposition. Although F2's "reporter" speech in [5-51] still involved a 
high proportion of indirect endings (40%), obviously the speaker 
consciously shifted her choice of ending forms to "reporter" modality in 
response to the hearers' expectations. This case also suggests that 
direct evidentials are generally less preferred for (F) and (G) type 
propositions. 
On the other hand, it was observed that propositions which 
obviously belonged to a speaker's information territory were 
occasionally expressed in an indirect sentence-ending form. This 
phenomenon happens when the speaker does not wish to express the 
closeness of the proposition to himself. For example, one witness in the 
Yakugai-AIDS court case, when uttering a statement that was seemingly 
inconsistent with his own previous testimony, had difficulty 
ending the utterance with a direct ending, paused for a while, and used 
an indirect ending for a sentence describing his own behavior: 
(5-52) 
M17: watashitachi wa shinkenni Teikyoo daigaku no 
     we TOP seriously Teikyo University POSS 
     shoorei o kangaeteita wake de arimasu kedomo 
     cases OBJ thought(STAT) HON but 
     maa, tookyoku ni mo kono ten o gosoodansiteita 
     well, concerned Ministry DAT also this point OBJ consulted(PROG) 
     to iu wakede gozaimasu. shikashi sono jiki 
     COMP reason HON however that time 
     ni tsukimashite wa yahari shimbun ni happyoo 
     concerning CONT as understood newspaper LOC announcement 
     ga atta jiten to iu koto de (pause) 
     NOM did time QUOT COMP by means of 
     kyoo wa ohanashi shite okimasu. 
     today CONT HON-speak (te) 

M17: We were seriously thinking about the cases [of AIDS 
     patients] of Teikyo University Hospital, well, we were 
     consulting with the Ministry [of Health] on this point. But 
     as to the date [of our actual contact with Teikyo], today I 
     [tentatively, in a preparatory action for the future,] say that 
     it was the time when Teikyo announced [the cases of AIDS 
     patients] to newspapers. 

The speaker was the chief of an AIDS-patient-designating 
committee who was suspected of being partially responsible for the 
delay in the recognition of the first AIDS patient in Japan. It was also 
suspected that he knew that Teikyo University had twenty-three 
HIV-positive patients but concealed the fact for a year in collaboration 
with Teikyo University Hospital. Those hemophiliac patients were all 
infected with AIDS through the use of unheated blood-forming 
enhancer, which the United States had earlier cautioned against using. 
It is suspected that the Ministry purposefully permitted the continued 
use of this licensed medical product. Weeks before, Speaker M17 had 
testified that he inquired at Teikyo University about the twenty-three 
HIV-positive patients one month before Teikyo's press release on 
Japan's first officially "recognized" AIDS patient, who lived in America 
at the time. However, in the above statement he testified that he 
actually contacted Teikyo regarding the hemophiliac patients after the 
press release, contradicting his own previous statement. In giving the 
above (second) testimony, the speaker could not smoothly present 
the sentence-final modality coding, as if he had difficulty in deciding 
on the appropriate sentence-ending mode, and he finally finished the 
sentence with the phrase "kyoo wa ohanashi shite okimasu" (I speak 
like this today), in which okimasu connotes "tentative behavior": a very 
unusual lexically-indirect ending. Since in Japanese the modality 
marker is at the very end of the sentence, in uttering a given sentence, 
psycholinguistically speaking, the speaker has some time to decide the 
sentence modality. The speaker paused for a while before the sentence 
ending, and the hearers were waiting for his modality coding. The 
speaker then used an indirect-form sentence-ending in haste. If he 
had uttered the testimony with a direct sentence-evidential form such as 
deshita (direct form copula) in "sono jiki wa shinbun ni happyoo ga 
atta ato deshita" (The contact with Teikyo was performed just after 
the newspaper announcement.), the testimony would have linguistically 
presented 'realis' from his epistemic viewpoint, which was not 
presented in his testimony above. 

There are other examples of the use of indirect evidentials for the 
speaker's own information. In (5-53) the speaker used a conjecture 
auxiliary, deshoo (probably), in talking about his own feeling about 
choosing his first job after graduation. The event happened a long time 
ago; therefore, I got the impression that the memory had become 
"distant" enough to the speaker himself to let him use an indirect 
evidential. In the course of this discourse, the speaker generally 
presented a retrospective view of the early stage of his life. He also 
added a confirmation -ne to pretend that the information was shared by 
the hearer as a common understanding resulting from the previous 
discourse (i.e., "as you might imagine"). 
(5-53) 
M13: iya yoosuruni ne, suugaku toka rika toka sooiu shiken 
     well, in short RAPP math. etc. science etc. like exam 
     ga attara moo zettai dameda-tte iu ki 
     NOM exist(COND) EMP absolutely no-good-QUOTE feeling 
     ga atta-n desu yo. de benkyooshitenai kara 
     NOM existed-n-COP(FOR) VOC then studied(NEG)(STAT) because 
     nanka muudo de nanka koo sukida-tta-ra haireru 
     somewhat mood INS somewhat just like like(COND) enter(POT) 
     mitaini amai kimochi de ita-n-deshoo ne.. 
     like that optimistic feeling I was-n-probably(AUX) CONF 

M13: In short, [I felt like] I absolutely wouldn't do well if there 
     were exams such as math or science. Then, I hadn't studied, 
     so, if I liked [the job], somewhat, I would be able to get the job. 
     I probably was feeling in such an easy way (as we both know). 

PUBLIC SPEECH AND DIRECT EVIDENTIAL FORMS 

Another case that presents unanimous usage of direct forms is 
public talks (in which a speaker talks to the public). Public talks, 
which are usually one-way information transmission, showed two major 
characteristics. First, in talking to the public, a speaker does not have a 
specific audience, so naturally he is not concerned with interactional 
modalities such as interactive sentence-final particles and 
Group (3) or Group (4) type evidentials that show respect for the 
hearer's knowledge. As a matter of fact, I listened to a public speech 
delivered at a pharmaceutical company's media briefing and found only 
one sentence-ending mode for the entire two-hour conference: the 
modality coding was the polite direct ending. This is a natural 
consequence, since the talk was addressed to the public and the 
speakers did not need to use interactional sentence-endings. In this 
meeting, the speakers talked about the history and status quo of their 
products, so naturally all propositions were in the speakers' 
information territory as their professional knowledge; therefore, 
indirect sentence-ending forms were not used. This is the second 
characteristic of public speech. 

Data from news-shows indicate that speakers consider their 
propositions to be in their own information territory as professional 
knowledge. However, this does not mean that those speakers consider 
that they have the most privileged, primary access to the information. 
The Group (10) type evidential, "I think", was unanimously used in 
public talk situations in the form of omoware-masu (it is thought 
that...). Omoware-masu is in the passive voice but is still a direct 
ending (i.e., -masu), consistent with the overall reporting modality. It 
appears that newscasters hesitate to say omou (I think) 
straightforwardly, since the strong subjectivity of omou does not match 
their role as information transmitters. In this sense, newscaster-type 
speech behavior is not personal but only a "role-based" behavior. In the 
following, [5-54] shows that (A) type propositions are dominant in 
public talks, and [5-55] indicates, accordingly, that Group (1) type 
direct evidentials were most frequently used in this discourse genre. 
These features of newscaster talk as a "register", an "occupational 
register" in particular, are also seen with teachers, doctors, etc. 
(e.g., Cazden, 1988). 

[5-54] Occurrences of sentences with each proposition type in "public 
speech" discourse 

PROPOSITION TYPE                                         OCCURRENCE 

(A) The speaker's territory information                  238 (59%) 
(B) The speaker's territory, the hearer's knowledge       11 ( 3%) 
(C) Both parties' territory                               51 (13%) 
(D) The hearer's territory information                    24 ( 6%) 
(E) The hearer's territory, the speaker's knowledge       10 ( 2%) 
(F) Out of both parties' territory                        56 (14%) 
(G) Public information                                     8 ( 2%) 
(H) The speaker's self talk                                3 ( 0%) 
Total                                                    401 

[5-55] Occurrences of sentence-ending forms by groups for public 
discourse, for combined discourse types 

ENDING-FORMS                                  All types of proposition 

Group (1) (direct)                            223 (55%) 
Group (2) (D rapport -ne)                      45 (11%) 
Group (3) (SD tag question)                     9 ( 2%) 
Group (4) (DQ direct but questioning)          34 ( 8%) 
Group (5) (SD sharing ne#)                     18 ( 4%) 
Group (6) (Question forms)                     28 ( 6%) 
Group (7) (ID inference)                       12 ( 2%) 
Group (8) (ID hearsay)                         21 ( 5%) 
Group (9) (Auxiliary)                           3 ( 0%) 
Group (10) (ID I think)                         8 ( 2%) 
total                                         401 
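
The group-level counts in [5-55] can be collapsed into the broader 
categories used for the public discourse row of [5-47]. The following is 
a minimal sketch under stated assumptions: the mapping from groups to 
categories follows the labels printed in [5-55], and the Python names 
are illustrative and were not part of the original analysis. 

```python
# Group-level counts copied from [5-55] (public discourse).
GROUP_COUNTS = {1: 223, 2: 45, 3: 9, 4: 34, 5: 18,
                6: 28, 7: 12, 8: 21, 9: 3, 10: 8}

# Category of each group, read off the labels in [5-55]
# (D = direct, SD = semi-direct, DQ, Q = question, ID = indirect, AUX).
GROUP_CATEGORY = {
    1: "direct", 2: "direct",
    3: "semi-direct", 5: "semi-direct",
    4: "DQ",
    6: "question",
    7: "indirect", 8: "indirect", 10: "indirect",
    9: "AUX",
}


def collapse(group_counts):
    """Sum the group counts within each broader category."""
    totals = {}
    for group, count in group_counts.items():
        category = GROUP_CATEGORY[group]
        totals[category] = totals.get(category, 0) + count
    return totals


category_totals = collapse(GROUP_COUNTS)
grand_total = sum(category_totals.values())

print(category_totals, grand_total)
```

Under this mapping the sums are 268 direct, 34 DQ, 27 semi-direct, 41 
indirect, 3 AUX, and 28 question forms, 401 in all, which agrees with 
the public discourse figures in [5-47]. 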

Each speaker of public speech has a different speech style, but 
generally, direct modality coding is used in talking to the general 
public. Evidentials that appeared with proposition types other than 
(A) come from the "insider-communication" part of public speech. 
As seen, a news show, for example, is often presided over by multiple 
speakers (newscasters), and they talk on and off with their colleagues. 
This insider-talk kind of speech situation accounts for the occurrence 
of interactional hearer-sensitive sentence-endings in the data. Also, if 
a show is broadcast from multiple places with speakers who are 
dispatched to places other than the broadcasting station to show local 
conditions, these "dispatchers" often talk with hearer-sensitive 
evidentials such as Groups (2), (3), (4), (5) and (6). That they often have 
something visible in front of themselves to show and describe seems to 
make the speaker's modality coding more interactive: they have direct 
evidence of their proposition. Furthermore, they have specific hearers 
in the station in addition to the general public. These factors may 
influence the dispatcher's psychology of talking and make his talk 
different from that of reporters at TV stations. 

COURT CASES AND DIRECT EVIDENTIALITY 

The Japanese government sometimes conducts court-like 
proceedings called "Shoonin-kanmon" (summoning witness) when a 
serious violation of a law or of citizens' rights is suspected inside 
national governmental bodies or related private areas. Usually it is very 
difficult to prove that a crime occurred in the governmental system, 
since government activity is quite secretive. In order to determine 
whether criminal acts may have occurred, Shoonin-kanmon is sometimes 
conducted within the Diet and/or the Parliament. Informants of the case 
are summoned and required to testify under oath, and if untruthful 
testimony is given, they may be prosecuted for false testimony. This is 
not exactly a "court" (rather, "pre-court" proceedings), but the purpose 
and system are fairly similar. The data consisted of a part of the 
Shoonin-kanmon for the Yakugai-AIDS case (the case of patients who 
acquired AIDS from medical treatment). In the data, the questioners 
(Diet members) typically used direct forms in talking about every aspect 
of the case. Naturally, the testimony itself belongs to the witness's 
information territory, but it is understood that the known facts are also 
in the questioners' information territory. So the questioners used the 
codings of shared information for a large proportion of their 
utterances. However, for the person testifying, the information is his 
own, and the discrepancy between their understandings appeared in 
sentence-ending evidential forms. The pattern of this discourse type is 
unique in that this does not happen in regular daily conversation. 

The next table shows the occurrence of sentence-ending forms 
by the two groups. Both sides used Group (1) direct-type forms for a 
large proportion of their utterances suggesting that there is 
consciousness of direct evidence in this speech situation. 

[5-56] Occurrences of sentence-ending forms by group for 
court-defendant and court-prosecutor discourse 

ENDING-FORMS                            OCCURRENCE 
                                        PROSECUTOR    DEFENDANT 

(G1) D direct                           162 (50%)     243 (78%) 
(G2) D rapport -ne                       13 ( 4%)       8 ( 2%) 
(G3) SD tag-question                      1 ( 0%)       0 
(G4) DQ questioning direct forms         10 ( 3%)       0 
(G5) SD sharing -ne                       1 ( 0%)       0 
(G6) Q Question                          62 (19%)       0 
(G7) ID inference                         1 ( 0%)       1 ( 0%) 
(G8) ID hearsay                          18 ( 5%)       9 ( 2%) 
(G9) AUX                                  9 ( 2%)       4 ( 1%) 
(G10) ID 'I think'                       47 (14%)      45 (14%) 
Total                                   324           310 


FORMALITY OF SPEECH AND EVIDENTIALS 

In an earlier section, formal and informal (friend and family) 
discourses were contrasted to see the influence of formality on the 
speaker's choice of evidential for all proposition types. It was found 
that in talking with friends, speakers sometimes evidentially handle 
some information that is outside of both parties' territories (F and G) as 
if it were in their information territory; however, it is also noted that 
they do not treat their conversation partner's information as their own. 
On the contrary, the speakers respect the hearer's information 
territory and knowledge even in informal friend discourse. This point 
was statistically supported by sentence-ending forms for (B), (C), (D), 
and (E) information types in which information is shared by both 
parties. 

In family discourse, on the other hand, speakers did not pay 
as much respect to the hearer's information territory and knowledge 
as they did in friend discourse. In particular, when expressing 
(B) and (C) type propositions, which fall in the speaker's information 
territory but are shared by both parties, the speaker's sensitivity to the 
hearer's knowledge was quite low in family discourse. In the following 
[5-57], occurrences of Group (1) to Group (6) ending forms are listed for 
(B) and (C) type propositions for family and friend discourses. The 
underlined ending-forms in [5-57] are "recommended" informal 
evidential groups for (B) and (C) respectively from the model. Friends 
and family are both uchi (inside) type speech situations, but the 
speakers were obviously less assertive toward their friends. Family is 
the most fundamental uchi unit, so that the concept of each other's 
information territory in family situations is different from other 
informal discourses. The use of direct forms in (C) type propositions in 
family discourse probably occurred due to the intimate feeling among 
family members. Friend discourse was also sometimes spoken with 
direct evidentials for (C) type propositions, which fall into both 
parties' information territories. 

[5-57] Occurrences of sentence-ending forms for (B) and (C) type 
propositions in family and friend type discourses 

Proposition   Ending-form                    Discourse type 
type                                         family    friends 

(B) type      G(1) D direct, etc.            21%        0% 
              G(2) D rapport -ne.            29%       15% 
              G(3) SD confirm-ne., etc.      21%       62% 
              G(4) DQ                        24%       20% 
              G(5) SD sharing -ne#            0         0 
              G(6) Question                   2%        2% 
(C) type      G(1) D direct, etc.            31%        9% 
              G(2) D rapport -ne.            16%       17% 
              G(3) SD confirm-ne, etc.        4%        4% 
              G(4) DQuestion                 33%       40% 
              G(5) SD sharing -ne#            9%       16% 
              G(6) Question                   4%        4% 

On the other hand, for (D) and (E) propositions (i.e., information 
that falls primarily into the hearer's territory), the appropriate use of 
evidentials was observed in family discourse. In family discourse, 
Group (4) type endings occurred 43% of the time and Group (6) type 
endings occurred 20% of the time in expressing (E) type propositions. 
The result is comparable to that from the friend discourse as well as the 
sum of all discourse types. But the sum of occurrences of Group (1), (2), 
and (3) ending forms is proportionally highest in family discourse 
even for (D) and (E) information. These data imply that speakers treat 
the hearer's information as their own more often in family discourse 
than in other speech situations, although they pay some respect to the 
information owned by the hearers. This is natural since, as Corollary 
Two stipulates, a speaker is entitled to treat a close person's information 
as his own. 

Before data analysis, it was expected that family members would 
be "un-respectful" to each other's information territory in every way, 
but this result indicates that family discourse conforms to the general 
model to a large extent. Blum-Kulka (1990) paid attention to 
parent-children discourse and said that there are three key notions in 
family politeness: power, informality, and affect. Power difference must 
naturally be the most influential politeness factor in parent discourse 
and makes utterances highly direct (i.e., direct speech acts in her 
study), but the informality of the speech situation mitigates the 
directness of utterances and makes them non-offensive. Also, the factor 
of affect was found to be very important in indexing positive politeness. 
Although the "directness" of language is viewed differently in this 
research, Blum-Kulka's research supports the tendency to use direct 
forms in family discourse due to its environmental appropriateness. 

MALE VS. FEMALE DIFFERENCES 

Female speakers were expected to be more indirect than male 
speakers, as this had been clearly observed in my earlier study (Trent, 
1993), reflecting a stereotype both in Japanese culture and in the 
research on women's language. However, in this research, male/female 
differences turned out to be less obvious. Female speakers seemed to 
prefer direct evidentials more than male speakers did in some speech 
situations or in talking about certain propositions; however, as [5-46] 
shows, as a whole, female informants' direct speech was still 
proportionally less than male speakers'. 

First, in describing (A) type propositions, which can be the most 
direct, in formal speech situations, female speakers used direct Group 
(1) and Group (2) ending forms 64% (vs. male 54%) and 28% (vs. male 
27%) of the time respectively across a variety of speech situations. In 
friend discourse, female speakers used Group (1) direct forms 75% (vs. 
male, 70%) of the time and Group (2) direct forms 7% (vs. male 17%) of 
the time. However, in family discourse, the percentage of Group (1) 
forms used by female speakers (73%) for (A) type propositions is 
smaller than that of male speakers (87%), implying that male speakers 
are possibly more direct in a family speech situation. 

For (C) type propositions, which fall into both hearer's and 
speaker's territories, females' evidential behavior was much less direct 
than that of male speakers, as indicated in [5-58]. The data in [5-58] 
show that male informants were more direct than female informants in 
expressing (C) type information to which the hearer equally has direct 
access. This result may imply that female speakers are possibly more 
sensitive to the hearer's knowledge and territory. 

[5-58] Female vs. male speakers' use of ending-forms for (C) type 
information 

Discourse   Ending form                    male    female 
type 

Formal      G(1) D direct                  14%      9% 
            G(2) D rapport -ne., etc.      21%      7% 
            G(3) SD confirm -ne., etc.      2%      0 
            G(4) DQuestion                 41%     26% 
            G(5) SD sharing -ne#           14%     48% 

Friend      G(1) D direct                  20%      7% 
            G(2) D rapport -ne., etc.      15%     17% 
            G(3) SD confirm -ne., etc.     10%      3% 
            G(4) DQuestion                 35%     42% 
            G(5) SD sharing -ne#            2%     20% 

Female speakers' sensitivity to information shared with hearers 
was also implied in the data for (E) type propositions, i.e., information 
which falls in the hearer's territory that the speaker has some 
knowledge about. Since the total occurrences of E type information 
were small in number in the entire data set, a comparison between 
male and female usages cannot be considered strong evidence, but at 
least it seems that male speakers preferred direct forms more than 
female speakers in both formal and friend discourse. Interestingly, as 
[5-59] suggests, in expressing E type propositions, male speakers 
preferred Group (1)-direct, Group (4)-DQ, and Group (6)-Q type 
sentence-ending evidentials consistently for both formal and informal 
discourses. On the other hand, female speakers used Group (4), (6), and 
(8)-hearsay type endings for formal discourse, and (4) and (6) for 
informal discourse. This difference implies that female speakers may 
be more conscious of situational difference than male speakers (cf. 
[5-58], also chapter six). 

[5-59] Female vs. male speakers' use of ending-forms for (E) type 
information 

Discourse   Ending form                    male    female 

Formal      G(1) D direct                  23%      2% 
            G(2) D rapport -ne.             0%      1% 
            G(3) SD confirm -ne., etc.      0%      2% 
            G(4) DQuestion                 53%     44% 
            G(5) SD sharing -ne#            0%      5% 
            G(6) Question                  15%     19% 
            G(7) ID Inference               0%      0% 
            G(8) ID hearsay                 0%     16% 
            G(9) AUX                        0%      5% 
            G(10) I think                   0%      0% 
Friend      G(1) D direct                  12%      5% 
            G(2) D rapport -ne              0%      2% 
            G(3) SD confirm -ne             0%      8% 
            G(4) DQuestion                 31%     60% 
            G(5) SD sharing -ne#            0%      0% 
            G(6) Question                  43%     22% 
            G(7) ID Inference               0%      0% 
            G(8) ID hearsay                 0%      0% 
            G(9) AUX                        6%      0% 
            G(10) I think                   6%      0% 

EFFECT OF AGE - CHILDREN'S DISCOURSE 

The quantitative data viewed by age groups are calculated for 
eight different groups. However, since the informants were not spread 
out evenly over age groups or speech situations, it was difficult to 
positively identify solid patterns by the age factor. For example, when 
talking about (A) type propositions, speakers from each age group used 
direct Group (1) evidentials frequently, as shown in [5-60] below. These 
surface figures may suggest conclusions such as that speakers in their 
30's may be most assertive in formal conversation and informal friend 
discourse but that in family discourse, teenagers were most assertive. 
However, this observation is hardly reliable due to the small number of 
informants that supplied information for each age group. 

[5-60] Occurrences of Group (1) type direct sentence-ending evidentials 
for (A) type propositions in formal, friends, and family discourse 
by each age group 

Age group   Formal   Friends   Family 

10s         N.A.     N.A.      92% 
20s         51%      65%       N.A. 
30s         75%      92%       50% 
40s         57%      89%       82% 
50s         N.A.     N.A.      N.A. 
60s         55%      N.A.      N.A. 
70s         50%      66%       N.A. 
80s         N.A.     N.A.      N.A. 

For this reason, a detailed analysis of the age factor was 
abandoned except for the youngest informants: second-graders and 
teenagers. Young people seemed to talk more directly. Naturally, 
single-word utterances and simple direct endings were frequently heard. 
Children's speech was direct across proposition types as well as 
discourse types. For example, in the category of school students, (A) 
type information was almost exclusively expressed by Group (1) type 
evidential forms (98% for second-graders, 100% for eighth-graders). As 
to (C) type propositions, i.e., shared information, 100% were expressed 
with Group (1) forms by second-graders, and 50% by eighth-graders. 
Even (D) type information, which is usually expressed by question 
forms, was expressed in Group (1) type direct forms 40% of the time by 
eighth-graders, although standard question forms also occurred 40% of 
the time. 

Using direct forms in expressing the hearer's thoughts or inner 
feelings (i.e., D-type information for the speaker) can be an expression 
of intimacy between the two parties. An example is shown below. The 
speakers are both early teenagers (brother and sister) and obviously 
have a good relationship. 
[5-61] 
F3  (1): dareka sukina yakyuu senshu imasu . 
         anybody you like baseball player exist(FOR) 
M23 (2): eeto ne., jaiantsu no ootaki senshu 
         well RAPP Giants POSS Ootaki-player 
    (3): nanka suraidingu ga kakkoii-n-desu. 
         somewhat sliding NOM cool-n-COP(FOR) 
    (4): nirui ni iku toki ashi de tacchi-shinaide 
         second base DIR go time feet INST touch-NEG 
         te de tacchi-suru 
         hand INST touch 
    (5): nanka kakkoii 
         somewhat cool 
F29 (6): sore-dattara daredatte ii-n da. 
         it-COND anybody good-n-COP 

F3  (1): Do you have any preferred baseball player? 
M23 (2): Well, Ootaki player in the Giants. 
    (3): His way of sliding is somewhat cool. 
    (4): When [he is] going to the second base, he touches the 
         base with his hands not his feet. 
F29 (5): If somebody is [doing] so, [you] like anybody. 

In the conversation, F29 expressed her brother's thought in (5) 
with the direct ending n-da. Judging from the whole context, F29 had no 
background information for M3's idea about his favorite baseball player 
before this conversation; therefore, utterance (5) is not based on 
hearsay or inference. Her intention in being assertive in this 
proposition is to tease her brother about his overly simplistic reason 
for preferring a baseball player. The use of a direct ending in this case 
emphasizes the close relationship between the speakers. In adult 
sibling and friend discourse, utterances like (5) would likely be said 
with question forms or DQ-type forms. 

Another example, from children's school discourse, shows the same
function of direct forms in expressing the
hearer's proposition. In this conversation among eighth-graders, S1
started to introduce himself to the interviewer, but other students, S2, S3,
and S4, took over the discourse:

(5-62) 
S1 (1): namae wa AA desu. shozoku wa...
name TOP AA COP(FOR) Belonging to..
S2 (2): go-kyoodai wa .
HON-siblings TOP
S3 (3): yoku nita otooto ga hitori
well look alike younger brother NOM one
S4 (4): sokkuri
identical
F25(5): futagona-n-da yo ne .
twin-n-COP VOC CONF


S1 (1): (to interviewer) The name is AA. I belong to..
S2 (2): (to S1) Any siblings?
S3 (3): (to S1) A younger brother who looks exactly like [you]
S4 (4): (to S1) very alike
F25 (5): (teacher) You are twins, am I right?

Students S2, S3, and S4 were trying, in a sense, to help the
interviewer by offering more background data about S1, and also to tease S1.
The students' hearers were not only S1 but also the other students
present, the interviewer, and their teacher. However, the attention of
S2, S3, and S4 was still directed toward S1 himself, so utterances (3) and (4) are
considered to be (E) type utterances with direct endings. The direct
forms that appeared in this case also imply a close relationship among
the speakers.

One of the differences between the adult and child groups is that
children did not use "rapport -ne ." with direct forms (i.e., Group 2
ending forms) as much as adults did. Children's preferred direct
sentence-final forms were simple direct endings, simple noun endings,
and vocative sentence-ending particles such as -no and -sa.
Adults preferred ne (with any tone) probably because of the friendly
effect that ne easily creates even with assertive direct endings. It
seems that this function of ne was not a concern for the children. I
speculate that when a speaker is very young, in telling (A)
type information to others, the function of language is exclusively
information transmission for the child speaker. (F) type propositions,
i.e., information outside both the speaker's and the hearer's information
territories, were also expressed directly in Group (1) forms 52% of the
time by children. However, these results do not necessarily mean that
young speakers of Japanese do not have the concept of information
territory and evidentiality. Their concept may not yet have fully
developed, but it was observed that seven- and eight-year-old children
already have some understanding of the interaction of speech territories.
Their preference for directness is probably due to two factors:
an underdeveloped consciousness of information territory, and the casualness
of the speech environment (i.e., a high degree of intimacy among speakers).
Observing young children, I had a strong impression that child friends 
and adult friends are different; adult friends can be intimate, of course, 
but each individual's ego is more respected in an adult relationship. As 
Brown and Levinson argued in their politeness theory (1978, 1987), an 
adult individual's ego should be respected through being free from 
imposition (negative "face-wants"). Thus, in adult conversation, 
utterances such as "No, you are not hungry" for a (D) type proposition
(i.e., information that belongs to the hearer's information territory only) are
not normal, yet they can often occur in child discourse. In eighth-
graders' data, direct-ending forms from Group (1), such as direct forms and
n-da yo forms, were found for (D) type propositions. The use of these
direct forms for (D) type propositions appears too rude to occur in adult
discourse. Yet at the same time, children used hearer-sensitive ending
forms to some extent. In second-graders' data, DQ (n) daroo . forms
appeared for (E) type propositions; Q daroo ka . and Q no . appeared in
eighth-graders' data, also for (E) type propositions. These
hearer's-territory-sensitive endings are standard forms for (E) type propositions
in the model. However, for (C) type propositions (i.e., information that falls
in both parties' territories), Group (1) type direct endings were
dominant in second-graders' discourse, but confirmation ne . and
sharing ne# endings were seen in eighth-graders' (C) type
utterances, suggesting that a sub-division of the speaker's territory
information into (A), (B), and (C) is difficult to realize at younger ages.

As expected, for (F) type information, i.e., information outside
either party's territory, children were more direct than adults: Group
(1) type direct forms appeared in 52% of the data in students' (F) type
discourse, while Group (1) type forms occurred in 28% of the combined
data of all types of discourse situations. However, children's
consciousness of distant information was seen in their use of ending
forms. For this genre of propositions, even second-graders used AUX
kamoshirenai (might be), ID da-tte (hearsay), ID mitai (seems), ID omou
(I think), DQ n daroo . (tag-question), Q janai no . (negative question),
and other types of indirect or semi-direct forms, suggesting that
they have a certain awareness that some information does not belong to
their information territory, or at least that they indicated a low degree of
commitment to some propositions.

In addition to school situations, children's discourse was collected
from family discourse situations. Data from children (ages ten to fourteen)
in family discourse do not differ substantially from the
school students' data: their utterances for (A) type propositions were
exclusively direct, mostly with Group (1) type endings. For (C) type
shared information and (E) type, the hearer's information, children at
home also used direct endings, but hearer-sensitive Q forms and DQ
forms were also used. For (F) type information, which is outside the
territories of both parties, the most frequently used forms were from
Group (1): D direct (30%), D noun (10%), and D kara , D yo, and D
no. (4% each). But at the same time, indirect forms such as ID
omou (I think), ID mitai (appear), AUX kamoshirenai (might be), and DQ
n da yo ne . (tag-question) were used to indicate their uncertainty in
expressing other people's information. These hearer-sensitive ending
forms also appeared with children's (G) type propositions, i.e., public
information.

Children's psychology in dealing with other people's information
besides their own seems to be underdeveloped from the viewpoint of my
evidentiality model and needs to be further cultivated in social
interaction. The amount of data was small due to the difficulty of
having lengthy discourses with young informants, but I had the
impression that young informants have a fundamental concept of
information territory.

EVIDENTIALITY SHIFT 

Each individual most likely has favorite sentence-ending forms
in each proposition type and also in each group of ending forms. But a
speaker's set of preferred evidential forms is not necessarily
used uniformly across all the speech situations he encounters. Some of the
informants provided data in different discourse situations, which makes
comparison possible.

Speaker F3 provided both informal friend and formal business
discourse data (cf. Appendix H). For all proposition types, the speaker
apparently kept her "favorites" in both discourse types, with the
difference of formal/informal grammatical forms. For proposition type
(A), the speaker used D kara, D kedo, D n dakedo, D n desu no ne for both
informal and formal discourse, showing consistency of personal
preference. Only in informal discourse did the speaker use vocative-type
sentence-ending particles, yo or wa yo, noun endings, and rapportive
-ne. (These selections conform to my model.) Therefore, the speaker
was reasonably more assertive with her own information (i.e.,
proposition type A) in informal discourse. On the other hand, the basic
desu/masu direct form, which is most direct for type (A) propositions,
was predominant in formal discourse (98%), as shown in [5-63]. This is
probably due to the nature of the given formal discourse: business
discussion. The speaker talked with several service providers for her
office and management staff, so there was a power difference between
the speaker and her hearers. Even though her speech is completely
"formal", in the speaker's psychology the need for interactionally less
assertive evidentials was low in talking about her own business.
Formal "daily conversation", on the other hand, involves fewer assertive
direct evidentials for the same type (A) propositions, as we noted earlier.
Therefore, there are evidently different genres of formal discourse in
relation to situational features such as power, affinity, and purpose of
discourse. Some politeness studies demonstrate that "affinity" is one of
the most important politeness factors: higher affinity results in higher
politeness (e.g., Brown and Gilman, 1990). In formal business
discussions, such as F3's example, the speakers do not need to be
'affectionate' toward the hearer due to the practical purposes of the
discourse. The same is certainly true of courtroom discourse, in which
emphasis is not laid on an affectionate interpersonal relationship between
the interlocutors. In courtroom discourse also, the power
difference between the defendant and the questioners is a reason for the
tendency to use direct evidentials on the prosecutor's side. These "formal"
discourse types are all in formal language forms, but are not truly polite
in terms of evidentiality (see chapter six on this point).

In this study, I treated "courtroom discourse" and "public talk"
independently as special formal discourse situations. Still, within the
genre of 'ordinary' formal conversation there are situational
differences, so a unified quantitative analysis, such as the one I did in
this study, can be misleading. This is an issue that I would like to study
further in the future.

[5-63] F3's ending forms in formal and informal discourse for type (A)
propositions (cf. Appendix H)

Proposition type: (A) The speaker's territory

Ending form                    formal   informal
G(1)  D direct                   98%      88%
G(2)  D rapport -ne.              1%      11%
G(3)  SD confirm -ne. etc.        0%       0%
G(4)  DQuestion                   0%       0%
G(5)  SD sharing -ne#             0%       0%
G(6)  Question                    0%       0%
G(7)  ID Inference                0%       0%
G(8)  ID hearsay                  0%       0%
G(9)  AUX                         0%       0%
G(10) I think                     0%       0%

A difference was found with (F) type utterances (i.e.,
other people's information). For (F) type propositions, in formal speech,
F3 exclusively used indirect forms, while direct forms appeared in her
informal discourse more than half of the time. This result is consistent with
the overall analysis of evidential forms occurring for (F) propositions
in formal and informal situations (cf. [5-39]).

Speaker F5 provided three types of discourse: family, friend, and
formal discussion. This speaker also showed differences in her
preferred evidentials across speech situations (cf. Appendix I). In
expressing an (A) type proposition, the speaker preferred Group (1) and
Group (2) type direct endings, with a difference in emphasis. As the
following figures in [5-64] indicate, for formal speech, the speaker used
a large proportion of Group (2), direct plus rapport-ne . endings, to
mitigate the assertiveness of the proposition, but in friend speech,
Group (1) use is dominant. In family discourse, the speaker's use of
Group (1) and (2) forms decreased from that of formal and friend
discourse; instead, indirect forms and question forms were used more
often.

[5-64] F5's ending forms for (A) type propositions in formal and
informal discourse (cf. Appendix I)

Proposition type: (A) The speaker's territory

Ending form                    formal   informal   family
G(1)  D direct                   65%      92%       82%
G(2)  D rapport-ne., etc.        29%       8%        7%
G(3)  SD confirm -ne.             0%       0%        0%
G(4)  DQuestion                   0%       0%        0%
G(5)  SD sharing -ne#             0%       0%        0%
G(6)  Question                    0%       0%        4%
G(7)  ID Inference                0%       0%        0%
G(8)  ID hearsay                  0%       0%        1%
G(9)  AUX                         0%       0%        0%
G(10) I think                     2%       0%        4%
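
As a quick arithmetic check on the figures in [5-64], the short Python sketch below sums the Group (1) and Group (2) percentages for each discourse situation. Only the percentages come from the table; the dictionary layout and the helper name `direct_share` are illustrative assumptions and are not part of the original analysis.

```python
# Percentages for F5's (A) type propositions, copied from table [5-64].
# Only groups with a nonzero entry in at least one column are listed.
F5_TYPE_A = {
    "formal":   {"G1": 65, "G2": 29, "G6": 0, "G8": 0, "G10": 2},
    "informal": {"G1": 92, "G2": 8,  "G6": 0, "G8": 0, "G10": 0},
    "family":   {"G1": 82, "G2": 7,  "G6": 4, "G8": 1, "G10": 4},
}


def direct_share(distribution):
    """Combined share of Group (1) and Group (2) direct endings (hypothetical helper)."""
    return distribution["G1"] + distribution["G2"]


for situation, distribution in F5_TYPE_A.items():
    print(f"{situation:9s} direct (G1+G2): {direct_share(distribution)}%")
```

On these figures the combined direct share is 94% in formal, 100% in informal (friend), and 89% in family discourse, which is the sense in which direct forms decrease in family discourse.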

This speaker seems to be less assertive to her family members
than to her friends. In the speaker's own retrospective observation,
this may happen because of her long-distance, poorly maintained
relationship with her family. But the 82% rate of occurrence of Group
(1) ending forms in F5's family discourse is still larger than their
frequency rate in the whole family discourse data (79%).

For shared information (table [5-65] below), F5's behavior almost
conforms to my model, supporting the generality of the model to some
extent. The difference from the model is that the speaker preferred
Group (5) forms ("sharing -ne#") for friend and family discourse more
than Group (1) type direct endings, which are generally preferred in
these discourse types.

Also, the fact that the speaker hardly used Group (1) or (2) forms (direct
or rapportive -ne.) for formal discourse, which were fairly common in
the general model, demonstrates her low assertiveness in a formal
environment; however, the same speaker used the same Group (2) forms
in friend and family situations, indicating that the speaker is more
assertive to her friends and family in talking about shared information.

[5-65] F5's ending forms in formal and informal discourse for (C) type
propositions (cf. Appendix I)

Proposition type: (C) Both parties' territory

Ending form                    formal   informal   family
G(1)  D direct                    1%       5%       13%
G(2)  D rapport-ne., etc.         0%      34%       13%
G(3)  SD confirm -ne.             1%       0%        0%
G(4)  DQuestion                  15%      13%       42%
G(5)  SD sharing -ne#            73%      34%       28%
G(6)  Question                    4%       5%        2%
G(7)  ID Inference                0%       0%        0%
G(8)  ID hearsay                  0%       0%        0%
G(9)  AUX                         0%       7%        0%
G(10) I think                     3%       0%        0%

Although the size of F5's data set was not very large, F5's data for
(B), (D), (E), (F), (G) and (H) proposition types for formal and informal
discourse situations are approximately in line with the model,
suggesting that the same speaker makes an "evidentiality shift" according to
speech situations.

SHARED INFORMATION IN TEACHER TALK 

Evidentiality rules for classroom discourse for teaching
sometimes do not conform to the rules of the model. In the "known-answer
teacher question" in teacher talk (i.e., questions like "What is the
sum of 1 plus 1?"), the shared-information norm is often ignored by both
sides. The following is an example of an IRE (initiation-response-evaluation) sequence:

(5-66)
F26: ja kyuu senchi go miri wa nan miri desu ka?
then 9cm 5mm TOP how many mm COP(FOR) Q
S: kyuujuu-go miri.
95 mm.
F26: kyuujuu-go miri-datte. Minna ii desu ka?
95mm hearsay. everybody right COP(FOR) Q

F26: Then, how many millimeters are equal to 9 centimeters
and 5 millimeters?
S: 95 millimeters.
F26: [The answer] is said to be 95 mm. Class, is that right?

As with newscaster talk, teacher talk is a "professional register",
which is a conventionalized way of speaking in a particular social role
(e.g., Cazden, 1988). Features of teacher talk have been analyzed with
respect to the power or control in a teacher's role (e.g., Stubbs, 1983;
Cazden, 1988; Hess, et al., 1979; Heath, 1978). Often all kinds of linguistic
forms in teacher talk were analyzed, and the "indirectiveness" of the
teacher's talk was studied in relation to the teacher's authority of
imposition (e.g., Hess, 1979; Heath, 1978), but attention has not been paid
to the irregularity of evidentiality in teacher talk, since the IRE form is
taken for granted as the basic form of teacher talk.

In the teacher-talk style of information exchange shown
above in (5-66), even though the proposition is shared by both parties
from the evidentiality point of view, teachers ask questions as if the
proposition belonged only to the hearer's territory (i.e., the students'
territory), and students answer as though the information in their
reply were known only to themselves and not to the teacher.
This classroom convention evidently arises because the purpose of a
teacher's questioning is to
see whether the information exists in the hearer's territory, not to
emphasize the information-sharing environment. As the proposed
model of evidentiality suggests, expressing the shared status of
information seems to be important for harmonious conversation, but
this is not the main concern in teaching knowledge.

On the other hand, teachers also sometimes incorporated the
shared-information norm when asking the same type of question, using
evidential forms such as datta-kke? (was it such and such?-as we both
know?) and deshoo-ka? (isn't it such and such?), which involve the speaker's
(teacher's) knowledge about the proposition.

(5-67) 
F26: kono kurasu, gaikoku itta koto aru hito
this class foreign country have been to (MODI) person
donokurai ita-kke?
how many was-Q

(F26: How many people have been to any foreign country? -let me recall
our shared memory)

This utterance implies that the teacher, F26, was sharing her
pupils' information indirectly and asking for information based on that.
This kind of approach to students was very significant in the classroom,
particularly for context-based subjects. In the public elementary and
middle schools that I visited, I observed that, in the classroom, teachers
often treated their propositions as though they were already shared by
the students. Sentence-ending forms that belong to Groups (3), (4), and
(5) were frequently used by the teachers for this purpose, as in
discourse (5-68). In the discourse, the class was discussing a war-time
story. The teacher was talking about the main character, who habitually
and secretly drank his baby brother's formula even though he knew that
the formula was the only nutrition the baby could possibly have. The
teacher treated this information as being fully understood by her entire
class (although it may not have been so), since the story had
already been read by the class anyway.
(5-68)
F25 (1): nomitakute nomitakute shikatanai wake.
want to drink want to drink cannot help
(2): de kono kona miruku wa nonde ii no?
then this formula CONT drink all right Q
(3): nonjaa ikenai-n-da yo ne .
drink prohibited-n-COP VOC CONF
(4): kono non de wa ikenai, demo gaman dekizuni
this drink TOP prohibited but patient cannot
koo non-jau wake desho .
this way drink-(regretfully) didn't he?
(5): dakara non jau-n-desho .
so drink-(regret)-n -didn't he?
(6): non jatta ato boku wa doo iu kimochi ni natta
drank-(regret) after "Boku" TOP how-QUOT feeling DAT became
to omoimasu ka .
COMP think(FOR) Q
S (7): tsurai
difficult
F25 (8): soo da yo ne#, tsurai, kurushii, kurushimu, dooshite .
so COP VOC SHAR difficult suffering, suffering why
F25 (9): kona miruku wa Hiroyuki ni totte daijina mono
formula TOP Hiroyuki for important thing
dakara desu ne .
because COP(FOR) CONF


F25 (1): He could not help craving the formula.
(2): Then, this milk, is it all right for him to drink it?
(3): He should not drink it, should he? (confirm)
(4): [He] should not drink it, but cannot control his desire,
and he drank it, didn't he? (confirm)
(5): So, he did drink it, didn't he? (confirm)
(6): After he drank it, how do you think he felt?
S (7): Unpleasant.
F25 (8): He felt so, as we know (shared). Why?
F25 (9): Because the formula was a very important thing for Hiroyuki
[the baby's name], wasn't it? (confirm)

Together with real questioning endings in (6) and (8),
confirmation n-da yo ne. (wasn't it?) in (3), deshoo . (wasn't it?) in (4)
and (5), and sharing ne# (as we all know) in (8) were used ostensibly to
confirm the students' understanding of the story, but actually functioned
to transfer the teacher's view to the students.

The same teacher also used the confirming -ne . ending in
asserting her opinions. In (5-69), the teacher praised her students
on their progress in writing and suggested that their next target is context
improvement:

(5-69)
F25 (1): kanji no machigai toka ne., okurigana no machigai toka ne.,
Chinese character MODI mistake etc. PART(RAPP) suffix 'kana' MODI mistake etc. PART(RAPP)
moo hotondo nakunatte kimashita ne .
already almost disappear became PART(CONF)
(2): to iuka, kyoo wa nai-n-ja nai desu ka?
COMP say today CONT NEG-n- NEG COP(FOR) Q
(3): soo iu koo hyoogen no kihontekina bubun de
such like expression MODI basic part LOC
joozu ni natte-kimashita ne .
skillful became PART(CONF)
(4): tsugini kondo naiyoo-teki na bubun desu ne .
next next contextual part COP(FOR) PART(CONF)

F25 (1): Wrong Kanji writing and others, "okurigana" mistakes
and so on are becoming less and less, aren't they?
(2): Or rather, there are none today, are there?
(3): [You] are becoming proficient in the part of those basic
expressions, aren't you? (confirm)
(4): Next thing to do is 'context', isn't it? (confirm)

In lines (3) and (4), although she was stating her own opinion,
she used the confirming -ne ending as though she were suggesting
something everybody agreed with. This way, the teacher could
avoid giving the impression that she was pushing her opinion about the
students onto the students themselves.

Although there are some unique professional ways of using
evidentials, statistically speaking, the data from the teachers' discourse
with students were almost equivalent to the aggregate figures of the entire
data for all discourse types. In comparison with the numerical data
from formal, family, and friend discourse, the teacher's discourse was
found to be similar to family discourse in terms of evidentiality use in
each proposition type.

In teacher's discourse, (A) type propositions (i.e., the teacher's
own information) were fairly straightforwardly expressed by Group (1)
type direct forms (75%) together with Group (2) type, rapport-ne (10%).
This pattern is similar to family discourse, in which Group (1) direct
ending forms were used 79% of the time and Group (2) forms 12% of the
time. Group (1) and Group (2) type endings were also dominant in
formal and friend discourse, with greater emphasis on Group (2) type
endings.

(B) type propositions (i.e., information in the teacher's territory
that the students also know) were expressed mainly by Group (2)
rapportive -ne. endings (35%) and by Group (4) DQ forms (direct forms
seeking agreement) (23%). This result is also comparable to that of
family discourse, in which Group (1) was used 29% and Group (4) 24% of
the time. Formal discourse had heavier emphasis on Group (4) endings,
and friend discourse preferred Group (3) ending forms for type (B)
propositions.

Also for shared information, proposition type (C), the patterns of
evidential usage by teachers and family members were, again, similar.
Direct Group (1) endings were used in 27% of teacher discourse data and
in 31% of family discourse data. Group (4) forms were also preferred in
both of these discourse types, 34% for teachers and 32% for family.
Therefore, for these two types of discourse, speakers can be fairly
"direct" or "confirming". The teacher's discourse also preferred Group (6)
type forms for (C) propositions. Group (6) has only question forms,
and hence this is reasonable given the environment.

(E) type propositions (students' territory information) were also
expressed by question forms 35% of the time. Naturally, teachers'
discourse has a high occurrence of Group (6) question forms for (C), (D),
and (E) propositions. The fact that 50% of (E) type propositions were
expressed by Group (4) direct-question forms (DQ) is also
understandable for the same reason.

In this way, besides the frequent use of questions, the teacher's 
discourse to students may have a family atmosphere as far as the data 
are concerned. 
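
The similarity between teacher and family discourse reported above can be checked with simple arithmetic. The Python sketch below compares the percentages quoted in the text for (A) and (C) type propositions; the data layout and the helper name `point_gaps` are illustrative assumptions, and the (B) type figures are left out because the text compares different groups there.

```python
# Percentages quoted in the text for teacher and family discourse.
# Keys are (proposition type, ending-form group).
TEACHER = {("A", "G1"): 75, ("A", "G2"): 10, ("C", "G1"): 27, ("C", "G4"): 34}
FAMILY = {("A", "G1"): 79, ("A", "G2"): 12, ("C", "G1"): 31, ("C", "G4"): 32}


def point_gaps(first, second):
    """Absolute percentage-point gap for every cell the two tables share."""
    return {cell: abs(first[cell] - second[cell]) for cell in first if cell in second}


gaps = point_gaps(TEACHER, FAMILY)
for (prop_type, group), gap in sorted(gaps.items()):
    print(f"({prop_type}) {group}: {gap} points apart")
print("largest gap:", max(gaps.values()))
```

For the cells compared here, teacher and family discourse never differ by more than four percentage points.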

RELATIVITY OF INFORMATION TERRITORY 

In the proposed model of evidentials, the concept of information
territory based on corollaries has an important role. One factor that
should be noted is that an individual's information territory can be
relative to different hearers, who have different
information territories.

First, observe the following discourse which shows a case of 
"plural modality" to a single proposition in talking to different hearers. 

(5-70) 

F5 (1): tochi no nedan wa sukoshiwa yasuku natta?
land MODI price TOP a little cheap became

F16(2): (to F5) daibu ochitsuite kita, (to M5) ne .
significantly stable became PART(CONF)

F5 (1): Has the price of land become a little cheaper?
F16(2): (to F5) It has become quite stable, (to M5) hasn't it?

In answering F5(1), F16(2) showed two different moods; F16 used a
direct ending kita (came) because the proposition (i.e., land price) is in
her territory (as a land-owner) but not in F5's territory. But F16's
husband, M5, who shares the proposition in his own territory and has
more information than speaker F16 due to his business, was present, so
speaker F16 turned to M5 at the end of the utterance, and her final
modality ne. is directed to M5. This is an example of modality shift
according to hearers, which often happens in group conversation.

In this case, the proposition, the land price, was always in
speaker F16's territory in talking to both F5 and M5. However,
sometimes a given piece of information which belongs to a speaker's territory
in one speech situation does not belong to his information territory in
another speech situation. I call this phenomenon the "relativity of
information territory". The phenomenon is critically related to the
Japanese concept of "uchi" (in-group) and "soto" (out-group).
Following Corollary Two, a speaker considers a certain person's
information to belong to his own information territory as well in a speech
situation where the referent is considered an uchi matter (in-group
matter) of the speaker and the hearer is from soto (outside). However,
in another speech situation, the same information about the same
referent may be treated as being outside of the same speaker's territory.
Usually in this situation, the hearer belongs to an immediate group of the
referent (and is often the referent himself), and the referent becomes a soto
person for the speaker. This is due to the relativity of the Japanese uchi
vs. soto concept.

In the following conversation (5-71), speaker F5 is talking about
F3 (referent) to speakers F8 and M1. Since the referent is considered to
be F5's close friend by F8 and M1, speaker F5 uses direct evidentials
at the sentence ending in describing the referent (F3), treating the
referent as her uchi member. In the discourse, F5 is describing her
trip to Sweden with F3.
(5-71)
F8: (1) ima doko e tsutometeru-n-desuka?
now where LOC employed-n-COP(FOR)
F5: (2) AAA-tte iu seiyaku gaisha na-n-desu yo ne. suueeden no.
AAA-QUOT pharmaceutical company-n-COP(FOR) VOC RAPP Sweden MODI
(3) de, mae mo suueeden no seiyaku gaisha de
then previously also Sweden MODI pharmaceutical company
BBB-tte iu tokoro kara AAA ni utsutta-n-desu kedo ne.
BBB-QUOT place from AAA DIR moved-n-COP(FOR) RAPP
(4) jyooshi ni tsuite utsutta-n-desu.
boss to follow moved-n-COP(FOR)
(5) Sorede itta-n-desu kedo ne.,
so went-n-COP(FOR) RAPP
(6) moo inaka desu yo ne#
very rural COP(FOR) PART(VOC) (SHAR)
F8: (7) mukoo ni sundeirasharu no?
over there LOC live (HON) (STAT) Q
F5: (8) F3 desu ka?
F3 COP(FOR) Q
(9) ie ie, nihon ni oosaka ni tsutomete iru-n-desu kedo
no no Japan LOC Osaka LOC work(STAT)-n-COP(FOR)
(10) maa, honsha ni ne . koo insentibu torippu-tte
Well, headquarter DIR RAPP like this incentive trip-QUOT

F8: (1) Where does [she] work now?
F5: (2) [She works] for a pharmaceutical company called AAA, a
Swedish one.
(3) Then, [she] used to work at another Swedish company
called BBB.
(4) [She] moved with her boss.
(5) Then [we] went to Sweden.
(6) It is very rural, isn't it?
F8: (7) [Does she] live over there?
F5: (8) [Do you mean] F3?
(9) No, no, [she] works in Japan, in Osaka.
(10) Well, it was like her incentive trip to the headquarters.

In this discourse, although speaker F5 used sentence endings
with interpersonal functions for the sake of the hearers, the referent is
always described with direct forms, indicating that F5 considers F3 as being
in her information territory.

On the other hand, in the next discourse, the same speaker talked
about the same referent (F3), but the speaker used indirect forms to
describe F3 because the speaker was talking to F3 herself this time. In
both (5-71) and (5-72), speaker F5 mentioned the fact that F3 (referent)
lived in Osaka, but the evidentiality of the utterances differed between
the two cases.
(5-72) 
F5: (1) F3-tte, oosaka ni sunderu janai.
F3-QUOT Osaka LOC live(STAT) aren't you (CONF).
(2) Oomu no jiken, atta-n-desho. Oosaka de.
Aum POSS cases wasn't it? Osaka LOC
F3: (3) soooo, moo, chuushaki. pon to hito tare yo.
so well syringe ONOM one drop PART(VOC)
sore de shin-jau no yo.
that INS die-(regret) PART(VOC) (VOC)


F5 (1): You live in Osaka, don't you. 
(2): [You] had Aum problems in Osaka, didn't you? 
F3 (3): Yes, [they] dropped [poison] by syringe and a drop of 
[poison] was enough to kill people. 

In the above discourse, speaker F5 used a semi-direct evidential,
-janai., in saying that F3 lived in Osaka, which was well known to
everybody present. This time, the fact that F3 lived in Osaka was not
treated as being in speaker F5's information territory, although it had
been so treated in the previous discourse.
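
The contrast between (5-71) and (5-72) can be pictured as a comparison of distances to the referent. The Python fragment below is only an illustration under strong assumptions: the numeric distance scores, the function names `in_speaker_territory` and `ending_type`, and the strict-comparison rule are all hypothetical and are not part of the model as stated in this study.

```python
def in_speaker_territory(speaker_distance, hearer_distance):
    """Treat the referent's information as the speaker's own only when
    the speaker is closer to the referent than the hearer is.

    Distances are hypothetical scores: 0 means the person is the referent,
    larger numbers mean a more distant (soto) relation.
    """
    return speaker_distance < hearer_distance


def ending_type(speaker_distance, hearer_distance):
    """Pick a coarse ending type from the territory judgment (assumed rule)."""
    if in_speaker_territory(speaker_distance, hearer_distance):
        return "direct"
    return "semi-direct or indirect"


# (5-71): F5 (close friend of F3, distance 1) talks to F8 and M1 (distance 3).
print(ending_type(speaker_distance=1, hearer_distance=3))
# (5-72): F5 (distance 1) talks to F3 herself (distance 0).
print(ending_type(speaker_distance=1, hearer_distance=0))
```

The point of the sketch is that the same speaker and the same referent yield different ending types once the hearer changes.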

The next case of relativity of evidentiality was found in a TV 
interview. A female interviewer, F23, had two different hearers: the 
public and the person she was interviewing. In talking to the public, 
F23 treated her interviewee's information as her own information: 

(5-73) 
F23: (to public) karoora tuu no komaasharu, hajime wa
Corolla II MODI commercial at first CONT
hachi miri video de konna fuu deshita.
8mm video by like this COP(PAST)(FOR)

F23: (to public) The commercial of Corolla II was like this at the
beginning in 8mm film.

F23's interviewee was a commercial-film producer, and she and the 
interviewee were talking about his TV commercial film for Corolla II.
In the above utterances, F23 used a direct evidential, deshita (was),
suggesting that the interviewee's information was on her side when 
talking to program viewers. However, when talking to the interviewee, 
the same proposition, the Corolla II promotion video, was treated as the 
hearer's information which is shared by the speaker as in (5-74): 

(5-74)
F23: (To the producer) dekiagaru-to koo naru-n-desu ne.

 completed-COND this became-n-COP(FOR) (CONF) 

F23: When it (CF) is done it looks like this, doesn't it? 

F23 and the producer of the film were both watching it (i.e., direct 
experience), but F23 used the semi-direct evidential form nee., and her 
linguistic attitude toward the proposition showed more distance than 
in (5-73). 

Thus, the qualification to determine whether or not an individual 
owns information seems to be relative to the speech situation. This does 
not require revision of Corollary Two regarding the speaker's 
information territory, which is shown again below: 

COROLLARY 2 (speaker's information territory): 

A speaker's information territory contains the following three 
major types of information: 

(a) information obtained through the speaker's direct experience; 
(b) information about people, facts, and things close to the 
speaker, including information about plans, actions, and 
behavior of the speaker or other people whom the speaker 
considers to be close, and information of places with which the 
speaker has a geographical relation; 
(c) information embodying detailed knowledge which falls within 
the speaker's area of expertise (professional or otherwise); 
(d) information which is unchallengeable by the hearer due to its 
historically and socially qualified status as truth. 

It should be noted that the concept of "people, facts, and things close 
to the speaker" in qualification (b) is relative to the hearer. This 
leads to another corollary of evidentiality: 

Corollary 4 (Relativity of information ownership): 

The psychological distance between the proposition and the 
speaker, which is a condition qualified by (b) of Corollary 2, is 
relative depending on the distance between the proposition and 
the hearer as stipulated in condition (b) of Corollary 3, in such a 
way that a certain proposition could be regarded as belonging to 
the speaker's information territory when it is told to hearer A, 
yet when told to hearer B, the same proposition could be 
considered to fall in hearer B's information territory, rather 
than the speaker's due to hearer B's relative closeness to the 
proposition. 
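
The relativity stated in Corollary 4 can be illustrated with a minimal 
sketch. It is not part of the model itself: the numeric closeness 
scores, the names Closeness and information_territory, and the 
treatment of ties as "shared" are all assumptions introduced here for 
illustration. The sketch only shows that the same proposition, told by 
the same speaker, can fall in different territories depending on the 
hearer, as in (5-73) and (5-74). 

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Closeness:
    """Hypothetical 0.0-1.0 closeness of each party to one proposition."""
    speaker: float
    hearer: float


def information_territory(closeness: Closeness) -> str:
    """Assign the proposition to whichever party is relatively closer to it.

    Treating equal closeness as "shared" is an assumption of this sketch.
    """
    if closeness.speaker > closeness.hearer:
        return "speaker"
    if closeness.hearer > closeness.speaker:
        return "hearer"
    return "shared"


# Same proposition (the Corolla II commercial), same speaker (F23),
# two different hearers. All scores are invented for illustration.
to_public = Closeness(speaker=0.6, hearer=0.1)
to_producer = Closeness(speaker=0.6, hearer=0.9)

print(information_territory(to_public))    # speaker
print(information_territory(to_producer))  # hearer
```

The speaker's own closeness is held constant in both calls; only the 
hearer's closeness changes, which is the point of the corollary. 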

MULTIPLE SENTENCE-ENDING MODALITY FOR INDIRECT SENTENCES 

With proposition types (F) and (G), i.e., information from other 
people's information territory, and sometimes with type (E) propositions 
(the hearer's information), standard speakers, as the model suggested, 
frequently used hearsay (Group 8) and inference (Group 7) sentence-ending 
evidentials as well as evidentials of subjective judgement (Group 9 and 
Group 10). Sentences with these modalities may be considered to be 
syntactically indirect in that the proposition part is an embedded 
S-bar sentence which is "enveloped" by a matrix 
verb/adjective/copula-type phrase which is lexically indirect. It was 
observed that these indirect sentences with (E), (F), and (G) propositions 
often have additional semi-direct type evidentials at the sentence end, 
resulting in plural modality. For example, speakers used janai? (doesn't 
it?), ne# (as we know), yo ne. (direct vocative + confirmation) and 
other hearer-sensitive semi-direct endings with indirect mitai da (it 
seems that...), which results in mitai-janai? (it seems...doesn't it?), 
mitai-yo-ne. (it seems...am I right?) and so on. Although it was 
advocated that the sentence-final modality marking presents the 
governing modality of the sentence, it seems reasonable to think that 
some sentence-endings have multiple modality, combining semi-direct and 
indirect codings; therefore, both indirect and direct evidentials that 
occurred together at the same sentence-ending were counted in the 
database for this research. 

The multiple-modality-ending with Group (7) to Group (10) 
indirect-ending-evidentials occurred in 73% of the inference 
sentences, 47% of the hearsay sentences, 73% of the auxiliary 
sentences, and 70% of "I think" sentences (cf. appendix G). The 
observed frequency is fairly high. It seems that speakers did not want 
to end the sentences with basic forms of indirect endings (e.g. 
mitaida/desu), probably because of their concern for the hearer. 
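
How such proportions could be tallied is sketched below. This is only 
an illustration under stated assumptions, not the procedure actually 
used for the database: the function name multiple_modality_rate, the 
representation of each sentence as a list of evidential group numbers 
found at its ending, and the toy data are all hypothetical. A sentence 
is counted as having multiple modality when its ending carries an 
indirect evidential (Group 7 to Group 10) together with at least one 
evidential from outside those groups. 

```python
from collections import Counter

# Groups (7)-(10) are the indirect sentence-ending evidentials.
INDIRECT_GROUPS = {7, 8, 9, 10}


def multiple_modality_rate(sentences):
    """Return {indirect group: share of its sentences whose ending also
    carries at least one evidential from outside Groups (7)-(10)}."""
    totals, multiples = Counter(), Counter()
    for ending_groups in sentences:
        groups = set(ending_groups)
        has_extra = bool(groups - INDIRECT_GROUPS)
        for group in groups & INDIRECT_GROUPS:
            totals[group] += 1
            if has_extra:
                multiples[group] += 1
    return {group: multiples[group] / totals[group] for group in totals}


# Toy data, invented for illustration: each inner list holds the
# evidential groups observed at one sentence ending.
toy_sentences = [
    [7],      # basic indirect ending only
    [7, 3],   # indirect plus a semi-direct ending
    [7, 1],   # indirect plus a direct final particle
    [8, 3],   # hearsay plus a semi-direct ending
    [8],      # basic hearsay ending only
    [1],      # direct ending, not an indirect sentence
]

rates = multiple_modality_rate(toy_sentences)
for group in sorted(rates):
    print(f"Group ({group}): {rates[group]:.0%}")
```

With the toy data, two of the three Group (7) sentences and one of the 
two Group (8) sentences count as multiple-modality endings; the 
percentages reported above come from the actual database, not from 
this sketch. 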

For information that belongs to other people's direct information 
territory, the data show that for many speakers whether or not the 
information is shared by the hearer takes precedence over whether or 
not the information is publicly known well enough to be told in direct 
forms. Theoretically, for less-assertive discourse, basic forms of 
indirect endings are good enough for expressing (F) and (G) type 
propositions (i.e., other people's information), but pragmatically, 
plural modality of indirect plus hearer-sensitive semi-direct-type 
endings was preferred due to the interactional, 
hearer's-knowledge-sensitive function of Group (3), (4), and (5) 
semi-direct sentence-ending forms. This phenomenon seems to contradict 
the fact that direct forms were among the most preferred forms for (F) 
and (G) propositions in informal discourse, but it need not be 
contradictory. 

Morphologically, the basic forms of the above indirect hearsay, 
inference, and auxiliary endings are still direct-endings, with their 
indirectness coming from the lexical meanings. Therefore, the 
speaker's psychology, which is used to marking interactional 
indirectness by suffix-forms such as sentence-final particles and 
tag-questions, seems to prefer some extra modality in addition to 
Group (7) to (10) type forms, which are already indirect in meaning. In 
addition, from the perspective of sensitivity to a hearer's knowledge, 
basic forms of the indirect endings are still "declarative" in that 
they end with direct forms of indirect lexical items even if they are 
lexically declaring the speaker's low commitment to the truth value of 
his proposition. As has been argued, the speaker's consideration of the 
hearer's knowledge is an important factor of the Japanese evidentiality 
system that is morphologically realized by the use of Group (3), (4) 
and (5) semi-direct evidentials. I speculate that the preference for 
multiple evidentials in the sentence ending is due to these two 
reasons. 

A few examples of multiple evidentials are shown below. 

(1) Proposition (F) and (G) type evidentials (e.g. it seems, I heard, 
probably) with Group (1) and (2) type sentence-final particles 
(rapport-ne., vocative -sa, yo, -na, etc. that are considered to be 
direct) 
In the next example passage (5-75), F1 talked about the rumor 
about how the Aum cult got the materials for their poison gas. Hearsay 
mitai and G(1) vocative -yo are used. Vocative yo emphasizes the 
speaker's intention to be interactive, instead of merely conveying 
information which he indirectly obtained. 
(5-75) 
F1: nannka tsubureta kaisha no tokoro e 

somewhat bankrupted company POSS place DIR 

shitadori shimasu yo mitai ni kuruma de 

trade-in (FOR) VOC like car by 

noritsukete katte itta-tte sooiu hanashi ga

 drive to bought-QUOT such story NOM 

ikura demo atta mitai yo. 

many existed appeared PART(VOC) 

F1: Something like, it appears that there are abundant stories as that 
[they] went to several bankrupted chemical companies by truck 
and [said] they bought everything (I am telling you). 

(2) Proposition (F) and (G) type evidentials (e.g. it seems, I heard, 
probably) plus Group (3), (4) and (5) type evidentials (such as 
tag-question, "confirming" and "sharing" -ne) 
In the following example (5-76), the speaker F3 was talking 
about the police chief who was shot and wounded by an Aum follower. 
She used hearsay marker -tte itta (he said) + -tte hanashi (QUOT) + janai 
(negative question). 

(5-76)
F3 : kono jiken ga kaiketsu-sareru made wa


this case NOM resolved-(PASS) until CONT 

shine-nai-tte itta -tte hanashi janai . 

die(POT)-(NEG)-QUOT said-COMP story isn't it 

F3: It is said that there was a story that he said he wouldn't 
die until the case was settled, isn't it? 

These cases of indirect plus semi-direct evidentials may 
emphasize that lexical indirectness is not enough for Japanese speakers; 
interactive sentence-ending which indicates the speaker's will to 
involve the hearer's knowledge of his proposition seems to be 
considered more important. 

On the other hand, as noted, the speakers used a high proportion 
of direct evidentials (Group 1-forms) in expressing the same proposition 
types without considering either the outside information owners or 
the hearers, as in the case below (5-77). The topic of the following 
conversation is a rumor that Asahara Shooko, the leader of the Aum 
cult, was selling his hair to his followers to eat. F2 and F3 described 
the proposition in direct mode: 
(5-77) 
F3(1) : Datte kaminoke 
because hair 
datte wazawaza 
even deliberately 
kami ni 
paper in 
kurunde 
wrapped(te) 
F2(2) : soo yo. Are san-man, go-man -tte 
so VOC that $300, $500 - QUOT 
F3(3) : nannka 
somehow 
ocha ni irete 
tea LOC put(te) 
nomu. 
drink 
F2(4) : senjite nomu. 
brew(te) drink 
F5(5): honto. kimochiwaruui. 

really? feeling bad 

F3(1): Because, even his hair, wrapped with paper.
F2(2): It is so. [The price was] 300 dollars, 500 dollars,
F3(3): Something like, [they] drank it with tea.
F2(4): [They] brew and drink.
F5(5): Really? Disgusting.


However, this kind of direct mode for (F) and (G) type propositions is 
found mainly in informal conversations where the degree of 
politeness is not high; otherwise its use can be offensive, as explained 
in the following section. 

DIRECT EVIDENTIALS AND NEGLECT OF THE HEARER'S KNOWLEDGE 

A speaker can offend a hearer by talking about (G) or (F) 
information in a direct form (Group 1 or Group 2) that demonstrates a 
low concern for the hearer's knowledge. An example of this is from a 
formal TV conversation between M11 and F22 in the following (5-78). 
Their talk had three topics: F22's trip to Europe, M11's journalistic 
activities on the Aum-shinrikyoo case, and M11's past working 
experience at carnivals. M11 was overall a very polite speaker who was 
proficient in honorifics and sensitive to F22's knowledge; however, he 
used only direct evidentials when talking about the Aum case, which 
obviously offended F22, as can be seen from line (17). There seems to 
be an explanation for M11's language behavior. He is a journalist 
who was investigating the case at the time of the talk and an expert 
commentator on the case on nation-wide TV shows; therefore, in his 
mind, the Aum case was "his" case. As a journalist, certainly he supplied 
information about the case to the public through his interviews and 
discussions with the indicted Aum followers. According to Corollary 
Two, the Aum case is certainly in M11's information territory as 
professional knowledge; however, at the same time, laymen's level of 
knowledge about the case was very high at the time. So the topic is a 
(G) type proposition for everybody, and M11's failure to acknowledge 
F22's knowledge about the case seemed to eventually offend F22. The 
original Japanese transcription is in note 4: 

(5-78) 

F22: (1) For example, the suspect Joyuu, Is "suspect" all right?, 
This time, there was a court case, wasn't it? 

(2) Having seen it, 
(3) What do you think about that? 
(4) Have you seen it? 
M11: (5) Yes. Before he was arrested, I argued with him a few times, 
and also interviewed him. 

F22: (6) You did interview him a lot, as we all know. 

M11:(7) Yes, yeah, and, well, I saw him in court, and 

(8) He did not admit his guilt; he spoke about a sort of religious 
intention that he would follow his cult leader, Shookoo 
Asahara. 
(9) I was disappointed by him; I thought I saw the worst of him. 
F22:(10) What kind of person is that man, we wonder, don't we? Do you 
think he thinks that way really? 
M11:(11) Well, he is slightly different from the others in that he was not 
indicted as a suspect for the sarin case or the murder cases 
and so on, but he was charged for his old false testimony in a 
court case of 6 years ago held in Kumamoto, so he is arrested 
not for this Aum case.... 
(12):What was interesting was that he was the most interesting 
man among Aum leaders whom I interviewed. 
(13):I talked with Murai Hideo, who was stabbed to death by an antagonist, 
but there was no common "circuit" of conversation between 
us. 
(14):Same with the cult's attorney named Aoyama. 
(15):But only Joyuu could do ordinary conversation with us. 
(16):In that sense he was a very interesting target in investigation. 

F22 (17): Being challenged by experienced cunning journalists, that 
man, who is not more than 30 years old, had media interviews 
almost every day in that way; and anyway, he could confuse 
people by his talk, even if he was the winner of debate 
contests, usually people cannot do that good, 

M11 (18): He is a man with special talent, he is very quick in thinking 
and he was also very cool, I tell you. ...... 
M11 (19): 1990, the Aum people ran for public elections. 
(20): At the time, Joyuu was against the leader's idea of running for 
the election. 

(21): So, it was said so, when I was discussing with him, so I said to 
him that I heard that he was against the election, then he 
said, of course, nobody would win. 

(22): Then we also thought nobody from Aum would win, but in that 
kind of pyramid organization, people tend to obey the leader's 
value, 

(23): Actually they did, but only Joyuu was apathetic. 
(24): That he said that nobody would win immediately surprised us 
very much. 
F22: (25): There were many other things about him like that story. 
Therefore, his behavior in court surprised us. 

In this conversation, it is very clear from the utterances (1), (2), 
(17), (25), and other unquoted statements that F22 knew the topic (the 
suspect Joyuu and related stories) as well as M11 did through public 
reports by journalists including M11. However, from the evidentiality 
point of view, M11 did not show his acknowledgement of F22's 
knowledge about his topic, and continued to use direct-form (Group 1) 
evidentials (M11's underlined sentences) because the topic is his direct 
experience. But for standard speakers who share the concept of the 
model, a direct evidential means the topic is within the speaker's 
territory and is not known by the hearer. F22 politely used Group (4), 
Group (5), and Group (6) type evidentials at first, but started to use 
assertive direct evidentials herself starting from line (17). It seemed to 
me from her attitude that she was gradually becoming uncomfortable 
with M11's use of direct evidentials. F22 tried to express that the topic 
was shared between herself and M11 in lines (6) and (10) by using the 
evidentials of shared information, but M11 failed to acknowledge these 
"signs". 

The assertive language behavior of F22 is considered to be a 
demonstration of her information territory or knowledge. According to 
the proposed model, if conversationalists talk about the same referent's 
same behavior or events (i.e., shared knowledge, the suspect Joyuu in 
the example above) with direct evidentials, the situation is not standard 
because it means that both sides ignore the shared-status of 
propositional information. 

The case above is about publicly known information (F or G) 
which is likely to be shared by the conversationalists. The same kind of 
conflict over territory or knowledge occurred with speaker's-territory 
propositions about which the hearer had some knowledge that the 
speaker did not recognize. A speaker often makes the wrong decision in 
this manner when talking about his professional experience in 
particular. In the following discourse (5-79), M16, M14, and F23 talked 
about M16's profession, producing TV commercials, in 
which M16 seemed to be highly regarded. The discourse topic was 
certainly about M16's professional experience; however, at the same 
time, his conversational partners were familiar with M16's "products" 
through watching TV. Therefore, M16's propositions were not 
information that belonged solely to himself: it was shared as knowledge by 
the hearers, but this shared aspect of his own experience was ignored 
by M16. 

In the following conversation (5-79), M16 was explaining the 
strategic use of sound in his latest commercial film for a brand of beer. 
In the film, a famous actor was attending a wedding ceremony as a guest 
and reviewing his speech on a sheet of paper, which he put down on 
the table to drink the beer product. M16 was asserting that muting the 
background music at the same time that the actor put down the sheet of 
paper draws TV viewers' attention to the paper itself (which will 
disappear a moment later in a significant fashion to advertise the beer): 

(5-79) 

M16: (1) then, when [I] thought it needs to be easier to understand in 
some way, music is stopped when the paper is put down. 
(2) So the music, stop like this, [I] stopped the music. 
(3) Then, the point of putting down the paper is conveyed [to the 
viewer]. 
(4) Therefore, to use the sound in that way will make [a film story] 
very easy to understand. 
F23: (5) The sound of beer bottle being put down [on the paper] also 
changes in an outstanding way, doesn't it? 
M16: (6) Therefore, the sound of putting the bottle down makes the story 
go ahead. 
(7) Therefore, if the background sound did not stop at that point, 
the viewpoint of the TV-viewers in the living room only goes to 
the actor's face. 
(8) He is an actor, so [the viewpoint of the "living room"] goes only 
there. 
(9) Because it (i.e., actor's face) is the most interesting, the most 
interesting thing on the screen. 
(10) [I] do the things like this [in this film] to make the story 
understood by the viewers unconsciously, and let them see 
what (I) want them to see in only 15 seconds. 
(11) Unconsciously, the "living room" is a professional on TV, 
(12) Well, on average, [they] watch TV for 3 to 4 hours a day, 
(13) Since [they] have been doing for a few decades, the "living 
room" is a professional on TV better than anybody else. 

The English translation of this discourse may sound normal 
(the Japanese original transcription is in note 5), but in Japanese, 
speaker M16's behavior is problematic from the viewpoint of 
evidentiality. In this discourse, speaker M16 used only Group (1) 
direct forms and Group (2) rapport-ne. evidentials. So he behaved as 
though the entire topic was his own information, unknown to his 
hearers. However, it should have been noted by M16 that the two people 
he was talking with had watched this commercial film on TV a number of 
times so that they too had knowledge about it. In this sense, in (5-79) 
utterances (1) and (2) should have involved evidentials of 
shared-knowledge because the hearers knew that the music meaningfully 
stops at a certain point in the CM. M16 mentions "ochanoma" (lit. 
living room, i.e., "viewers in the living room") from (7) onwards as 
the strategic target of the film production, and expresses his analysis 
of "ochanoma" psychology; however, his hearers are also "viewers in the 
living room". Therefore, his analysis of the viewers, as expressed in 
utterances (7), (8), (11), (12), and (13), should have included 
shared-information evidentials (or even hearers' territory 
evidentials). Also, in the process of explaining 
things, some kind of mutual understanding based on topic-related 
common sense, or common sense regarding the process of discourse 
development, is generally established between the speaker and the 
hearers. For example, utterance (3) is a natural consequence 
of the previous utterances, so the proposition is readily understood 
by rational, intelligent hearers. In this sense, utterances (4), (6), 
(9), and (10) also needed to be coded to some degree with 
shared-information evidentials. For these reasons, M16's evidentiality 
coding was not satisfactory as Japanese discourse by the standard of my 
model. 
Actually, after observing this speaker for 30 minutes, it became obvious 
that this disregard of the hearer's knowledge and common sense is part 
of the speaker's speech style. Evidentiality data for M16's speech show 
that the speaker used polite forms and interactive sentence-endings 
consistently (cf. appendix J). On the other hand, the types of 
evidentials that he used were limited to Group (1) and Group (2). The 
speaker did not use any evidentials from Group (4), that is, the 
evidentials for completely shared information. M16 is much younger 
than his male hearer, M14. M14 did not seem to be offended but looked 
somewhat amused by M16's "young-generation-style" speech behavior. 
The discourse was formal, but half way through, M14 started to use 
casual plain-forms on and off in talking to M16. This behavior of M14 
may have been a reaction to M16's direct discourse style. M16's use of 
evidentiality codings may be universal among professionally successful 
young people: relative inexperience in society, combined with a strong 
persona affiliated with certain professions, may make a speaker act as 
though he is always "on stage" as the expert on his proposition. Also, 
it is highly conceivable that a change in "norm" may be in progress in 
the young generation regarding evidential marking, in the direction of 
allowing a more "direct" and assertive pattern of usage. 

EVIDENTIALS OTHER THAN SENTENCE-FINAL FORMS 

In addition to the sentence-ending modality, there are other 
linguistic techniques to make a sentence less-assertive. Some of these 
devices can also be called evidentials. I call these "sentence-medial 
evidentials" although they may also occur at the end of the sentence in 
some cases. Most of these evidentials are 'lexically' less-assertive. I will 
discuss only some of them as examples. These sentence-medial 
evidentials are beyond the scope of this study, but how this kind of 
evidential relates to sentence-ending evidentials is briefly 
overviewed.6 

(1) The first category is adverbs. The popular less-assertive adverbs 
include: tabun (probably), nanka (something like), dooyara (it 
somewhat appears/seems), doomo (it appears/seems), osoraku 
(probably, possibly, presumably), chotto (a bit), and moshikashite 
(possibly). 
(2) The following examples contain quotative evidential forms 
with to iu ka, to ka, yoo/soo-na (adj.), soo-iu, tari (adv.), kanji 
(noun), and others which, more or less, mean 'something like': 
(5-80) 
F5: chotto shoohisha ga bakani-sareteru-tte iu ka, 
a little consumer NOM fooled-(PASS)-QUOT 

soo iu kanji

 such QUOT feeling 

F5: I had a feeling which should be said something like 

consumers are fooled. 

(5-81) 
F22:(1) NHK nanka mo anaunsaa wa sugu soku 
NHK like also announcer TOP soon immediately 

chihoo-tte iu kanji desu mono ne .. 
local area-QUOT feeling COP(FOR) (VOC) (RAPP) 

(2) chiho-tte iu ka tookyo igai no tokoro de 
local area-QUOT or Tokyo except MODI place LOC 

zuibun jyuunen ijoo iru hito mo ite 
fairly 10years longer stay person also exist 

motto kamoshirenai 

longer might be 

tookyoo e kaette kuru hito to ka ne.. 

Tokyo DIR return come person etc. RAPP 

F22: (1) In NHK (National Broadcasting Association) also it 
is something like that announcers go to local areas 
soon (after joining NHK). 

(2) There are people who stay in somewhere like, "local 
area" or places other than Tokyo, more than 10 
years, or might be more. 
(3) Then there are people who come back to Tokyo or 
people like that. 
(5-82) 

M12: sokode koo iu nyuusu ga aru-tte wakarimasu kara 
then this kind of news NOM be-QUOT know(FOR) ABL 

maa sokode nanika motomer-are-tara koo iu 

well then something ask-(PASS)-(COND) like this 

koto o ioo-tte iu no wa chotto kangae-tari 

COMP ACC say(VOL)-QUOT COMP TOP a little think-etc, 

memo-shi-tari-tte iu no wa arimasu ne. 

take notes-etc-QUOTE COMP CONT exist PART(RAPP) 

M12: I can find then what news we will have [tonight on the 

show] so at that time, I prepare for my opinion; thinking or 

taking note or do something else in case I am asked for. 

(3) Passive voice is also effective in making the sentence 
less-assertive by creating a distance between the speaker and the 
proposition: 

(5-83) 

F18: shikashi izure no jidoo ni mo ketsuben no 
but any pupil LOC blood stool MODI 

shoojoo wa deteorazu imano tokoro 

symptom CONT appeared(NEG) at the moment 

byoogensei daichookin ooichigoonana to no 

virus colon 0157 with MODI 

kanren wa usui no de wa nai ka to mir-arete imasu. 

relationship TOP weak COMP NEG Q COMP see(PASS) (FOR) 

F18: However, no pupil has the symptom of blood in his stools; 
now it is considered that they are not related with O157 
virus. 

(4) Verb te-form plus verb shimau (lit. to finish) is a 
conventional phrasal form that means something has been 
already (often regretfully) done or finished. 

It may be an evidential form in that it connotes that something has 
been done without the speaker's initiative. So the speaker is 
certain that something has been done but implies that not he but 
somebody else or something beyond his control is the responsible 
party for the doing (although he often is). Vte-shimau is one 
example of a lexically less-assertive verb phrase. Often a 
speaker uses this VP to explain his actions without being too direct, 
as in (5-84). 

(5-84) 

M16: (1) de soshitara kyuuni bazaaru de gozaaru 
then then suddenly "bazaaru de gozaaru" 

-tte dete-shimatta-n- desu ne.. 

-QUOT came out (finished)-n COP(FOR) PART (RAPP) 

(2) de nande bazaaru de gozaaru-tte dete-shimatta-n 
then why "bazaaru de gozaaru"-QUOTE came out-(finished)
daroo-tte omotta-n desu kedo, 

conjecture-QUOT thought-n-COP(FOR) but 

itta totan moo tanoshikute shikata nai-n-desu ne.. 
said moment already happy cannot help-n-COP(FOR)RAPP 

M16: (1) Then, after that, suddenly, (phrase) "Bazaaru de 
Gozaaru" (voluntarily) came to me. 

(2) Then, I wondered why "Bazaaru de Gozaaru" 
(automatically) came to my mind, but, on the moment I 
said the phrase, I couldn't help being overjoyed. 
These are some examples of sentence-medial evidential forms, and 
there are others; however, the number of occurrences of these 
sentence-medial evidentials was smaller than expected. Fewer than 150 
uses of intra-sentential evidentials were found in 7,000 
sentences. This figure indicates that Japanese speech relies on 
sentence-ending forms in terms of epistemic modality. 

As expected, in relation to sentence-ending evidentials, these 
types of sentence-medial evidentials are seen in sentences which end 
with direct sentence-ending forms, Group (1), (2), and (3) evidentials in 
particular. As to the proposition-type, many sentence-medial 
evidentials are used with type (A) propositions (i.e., the speaker's 
information which the hearer does not know). It is natural to assume 
that a speaker unconsciously uses sentence-medial evidentials to soften 
the effect of direct endings in describing his own information. There 
are some cases in which sentence-medial evidentials are used in 
sentences that end with indirect forms. I assume this happens for the 
same reason as the occurrence of multiple sentence-ending modality. 
Tabun soo kamoshirenai (It might probably be so) and nannka soo 
rashii (it somewhat seems to be so) are examples. However, the 
combinations soo kamoshirenai mitai (it looks like it might be so) 
and osoraku kiita (I probably heard it is so) are very rare, suggesting 
that the scope of Japanese indirectness within a sentence is not 
limitless. 

From the perspective of propositional types, sentence-medial 
evidentials occur most often (in fact, about half of them) with (F) type 
propositions (i.e., information that is outside either party's 
information territory) expressed with rather direct sentence-ending 
forms, particularly the verb-ending forms of omou (I think). As 
observed, (F) type information is generally marked with indirect 
sentence-ending forms such as Group (7) (inferred) and Group (8) 
(hearsay) evidentials unless the proposition is widely accepted as 
publicly acknowledged truth. Naturally, in those cases, sentence-medial 
evidentials mitigate the directness of ending forms when describing 
other people's information. 

A high proportion of those intra-sentence evidentials occurred 
with the verbs omou (think), kangaeru (think), rikaisuru (understand), 
kanjiru (feel), kigasuru (feel) and others, in particular with omou 
(think) and its gerundive form omotteiru (think-stative). This 
phenomenon is also understandable because "I think" is a rather 
subjective evidential although, in this study, it is considered to be an 
indirect evidential based on the speaker's inference. A speaker cannot 
make "I think" syntactically indirect at the sentence-ending, as in omou 
mitai (it seems I think that); therefore, naturally, other types of 
evidentials are commonly used with omou for the purpose of mitigating 
the subjective nuance of this evidential. Among the sentence-medial 
evidentials listed above, (1) adverbs and (2) quotations are often 
followed by the verb omou and its related forms. The frequent 
expressions include: 

~ka na to omou (I think that it might be~) 

~ja nai ka na to omou (I think that it might be~) 

~to ka omotteiru (I think so or other) 

~to ka nanka omou (I think so or something like that~) 

~daroo to omou (I think it will probably be~) 

tabun/osoraku ~to omou (I think it probably~) 

~to iu fuu na kanji o ukeru (I receive a feeling like~) 

~to omowareru (It is thought that~) 

~to iu kanji da (It is a feeling like~) 

EVIDENTIALITY IMPLICATURE 

It is observed that a speaker can intentionally choose certain 
sentence-ending evidential forms other than those which are regarded 
as being "appropriate" for his proposition. There are two ways to do 
this. In one case, a speaker pretends to be less certain about the truth 
value of the proposition than he actually is by choosing lower degree 
evidentials. In the other case, he pretends the opposite: He chooses an 
evidential of higher degree than his actual commitment to the 
proposition deserves. Naturally, the speaker has some motivation for 
such behavior. I call this phenomenon "evidentiality implicature" (cf. 
Grice's conversational implicature) in that the hearer will, provided he 
is a rational adult, receive some kind of message from the speaker's 
intentional breach of the rules of socially accepted forms of 
evidentiality. 

One case equivalent to "evidentiality implicature" in my model is 
mentioned by Oishi (1985) in his research on the final particle ne. In 
analyzing speakers' usage of final particles ne and yone, he found that 
sometimes a speaker's choice of sentence final particles does not reflect 
the reality of the discourse. For example, when talking about a book he 
had read, a speaker was not sure if the hearer knew or had read the 
book. In introducing this topic, he used the "sharing -ne#" (of this 
research) as if he assumed that the topic is shared by the hearer. Later, 
when discussing this with the researcher, the informant explained that 
his reason for choosing the "sharing -ne#" was strategic; he did not 
want to use the rapport-ne. (in my model) because he thought it would 
sound as if he was assuming that the hearer did not have knowledge, and 
he was afraid of setting off his hearer's opinionated tendencies. He also did 
not want to use the questioning particle ka, because he thought that if he 
used ka, he would lose momentum before stating his contention. So 
he used "sharing-ne# " pretending to assume that the hearer had the 
same information, and that this was a mutually shared topic. As a matter 
of fact, this strategy worked well and he was able to further develop his 
contention about the book which he suspected that the hearer had not 
yet read. The speaker in this way could elicit conversational 
cooperation from the hearer (who listened saying "Hmm, Hmm, Hmm" 
without interrupting the speaker). So, Oishi contends that use of 
sentence-final particles can change reality and constitutes reality in 
discourse. In this dissertation, I treated evidentiality as a "coding" issue: 
The evidentials "code" the speaker's reality, but they do not "construct" new 
reality; however, the actual use of "coding" does not always 
straightforwardly follow the speaker's perception of reality due to his 
various pragmatic intentions realized by "implicature". 

Oishi's case presents a strategic use of "evidentiality implicature" 
to allow the speaker to voice his opinion safely without hurting the 
hearer's feelings. But more often, the purpose of the implicature is to be 
simply polite or to be aggressive toward the hearer. 

Sometimes a speaker deliberately ignores the borders between 
information territories among the conversationalists and speaks as if 
some proposition which is actually out of his information territory is in 
his territory. In doing so, the speaker's disrespect of other people's 
territory is implied. The children's statements in (5-60) and (5-61) are 
examples of this implicature. F22's intentional use of direct forms in the 
latter half of (5-78) is another example. 

Sometimes, however, in order to be polite, speakers use less 
indirect evidentials for certain types of propositions than the standard 
evidentials that the propositions actually deserve. Observe the 
following examples: 
(5-85) 

F8: (1) ii ouchi kawareta-n deshoo . 
        nice house bought(FOR)-n-tag Q 

F5: (2) iie, zenzen futsuu no uchi na-n-desu yo. 
        no at all ordinary house-n-COP(FOR) PART(VOC) 

F8 (1): You bought a very nice house, didn't you? 

F5 (2): Not at all. It is an ordinary house. 

F8 and F5 were talking about tax-returns and F5 mentioned that 
she had purchased a new house recently. Since F8 did not know this, for 
F8 the proposition (i.e., whether or not F5 bought a nice house) is a type (D) 
proposition in that the proposition completely falls in the hearer's 
information territory and the speaker does not know anything about it. 
Therefore, the standard statement expected from F8, according to my 
model, would be a question such as "Did you buy a nice house?" 

However, actually in stating F8(1), the speaker, F8, treated the 
information as if it was known to her as a presupposition (e.g. If you 
bought a house, it must be a nice house), and then used the evidential 
deshoo, which is basically for type (C) propositions (belonging to both 
speakers' territories) or D propositions (in the hearer's information 
territory but the speaker knows). F8's evidential implicature was made 
for the sake of being polite. A short conventional reply of agreement 
such as soo desu yo ne# (It is so, I agree) or hontoo ni ne# (It is truly so, 
I agree), which is often used by female speakers in replying to statements 
regarding the hearer's matter, characteristically shows this kind of 
politeness: The speaker pretends to share information in the hearer's 
territory and also pretends that the hearer's contention can be easily 
verified with common sense. 

In the following example (5-86) of implicature of this type, 
speakers F5 (who lives in America) and M1 were talking about F5's car, 
with which M1 happened to be familiar even though it is an American 
car not available in Japan. F5 commented on M1's knowledge and 
pretended that M1 knew much about the car which was in her own 
information territory. Before line (2), F5 did not know that her hearer 
had some knowledge about the proposition; therefore F5 treated it as in 
her information territory by introducing the name, Neon, indirectly in 
the quoted form, as a new piece of information to the hearer: 

(5-86) 

F5: (1) watashi, ima, Neon-tte iu no ni notteru-n-desu 
        I now Neon-QUOT NML drive(STAT)-n-COP(FOR) 

        kedo ne.. 
        PART(RAPP) 

M1: (2) neon, neon, doddi no yatsu. 
        Neon, Neon, Dodge POSS car 

F5: (3) soo na-n desu. Yasukute... 
        so-n-COP(FOR) Cheap (te) 

M1: (4) nihonsha taikoo-tte yatsu ne. 
        Japanese car competitive-QUOTE car PART(CONF) 

F5: (5) soo desu. sasugani yoku go-zonji desu ne.. 
        so COP(FOR) as expected well HON-know COP(FOR) PART(RAPP) 

F5 (1): I am driving a car named Neon, 

M1 (2): Neon, Neon, Dodge's? 

F5 (3): Yes. It is cheap and... 

M1 (4): [That] is the one which is said to be Japanese-car 
        competitive, isn't it.. 

F5 (5): Yes it is. As expected, you know very well, don't you .. 

In (5), speaker F5 ended the sentence with a formal direct form 
plus rapport-ne., indicating that she considered the proposition that 
"M1 is knowledgeable" to belong to her own territory as truth. By doing 
so, F5 paid M1 a compliment about his knowledge. The direct evidential 
functions here to imply that F5 treats the proposition "M1 knows very 
well about American cars", which is a (D) type proposition, as highly 
truthful. A short formulaic complimentary response to the 
conversational partner's matter is often used in casual conversation for 
the same purpose. Examples include sugoi janai. (That is great, isn't it.), 
ii janai. (That is good, isn't it.) and yatta janai. (You did it, didn't you.). 
In these phrases, a speaker pretends that the distance between the 
proposition (i.e., the hearer's matter) and himself is shorter than it 
actually is to express sympathy. This use of "evidentiality implicature" 
shows an aspect of Japanese politeness that is "positive politeness" in 
Brown and Levinson's politeness framework (1978, 1987). 

But more often, speakers used the concept of "evidentiality 
implicature" by using less direct evidentials than their propositions 
deserve. This view is supported by the statistical fact that (A) type 
information (the speaker's territory only) was more indirectly 
described in situations with higher formality. 

The relationship between evidential-coding and politeness is the 
topic of the next chapter. In the next chapter, I will argue that the 
rules in the model, both corollaries and standard evidential forms for 
each proposition, must be followed to be a polite speaker in Japanese 
discourse. Besides children, only a few adult speakers were found to 
noticeably and constantly go against this framework. Cases in which 
speakers' evidentiality behavior did not follow the model 
are not always evidentiality implicature; the informants may have a 
different set of rules or understandings of the concept of information 
territory. In discourse, their conversational partners showed some kind 
of reaction to this non-standard speech behavior. Examples of speakers 
who do not seem to conform to the commonly preferred selection of 
evidentials were quoted earlier in (5-78) and (5-79). There were a few 
other speakers who habitually used direct forms for non-(A) type 
propositions; they used direct forms for all information types. In 
everyday life, one occasionally encounters this kind of habitually-direct 
speaker, and according to my subjective observation, these people are 
not popular in general among Japanese people. A comment that is 
frequently heard about these speakers is hakkiri mono o iu (He says 
things clearly). Hakkiri, meaning clear, straightforward, or outspoken, 
does not have a good connotation in this context. People may not 
realize what is wrong with being hakkiri , but feel offended 
nonetheless. In the above critical comment, hakkiri actually means the 
overuse of direct-evidentials across all types of propositions. In this 
sense, non-conformity to the standard model of evidentiality may 
provoke social stigmatization, as discussed earlier. 

From the perspective of politeness, the most problematic feature 
in the speech behavior of speakers who do not appear to subscribe to 
the commonly accepted norms is that they did not make a distinction 
among proposition types (A), (B), and (C): information in the speaker's 
territory which the model stipulates as follows: 

(A) information that the speaker assumes the hearer does not know, 
(B) information that the speaker assumes the hearer knows, 
(C) information that the speaker assumes also falls into the hearer's 
territory. 

A speaker fully commits himself to the truth value of each of these 
three types of propositions since they each fall into the speaker's 
information territory; however, if the hearer may know even just a 
little about it, linguistically the proposition should be treated 
differently. In this sense the hearer's assumed knowledge about the 
proposition takes precedence over the fact that the speaker knows the 
proposition very well. For speakers who appear not to use the 
commonly preferred norm of evidentiality coding, these three 
proposition types are the same to all hearers: information which the 
speaker knows. Those speakers may speak about the information in 
their information territory in the same way to everybody without 
considering varying knowledge levels among different hearers.

 CHAPTER 5 SUMMARY 

In this chapter, I proposed a model of the Japanese sentence-
ending evidentiality system, which presents a set of widely accepted 
pragmatic usages of the evidential forms. The main arguments in this 
chapter are summarized as follows: 

(1) The model is based on the universal concept of linguistic 
evidentiality in that direct evidentials are used to express propositions 
for which the speaker has direct evidence on which to base his 
proposition so that his commitment to the proposition is strong. 
Otherwise, a speaker uses indirect evidentials (i.e., Corollary One). This 
concept of direct and indirect evidence in Japanese is explained 
through the concept of the speaker's information territory: only 
propositions characterized by Corollary Two belong to the speaker's 
information territory, and thus are expressed by direct evidentials. 
Otherwise, a proposition may belong to the hearer's information 
territory or someone else's. 

(2) A Japanese speaker's evidential usage is very sensitive to his 
hearer's assumed knowledge about his proposition, i.e., whether the hearer has 
information about the speaker's proposition in his own information 
territory, or the hearer has mere knowledge about the proposition, or 
the hearer does not have any knowledge. Each situation requires the 
speaker to use different kinds of direct evidentials. 

(3) Based on (1) and (2) above, I have grouped propositions into 
six basic types, and proposed "preferred" forms of sentence-ending 
evidentials for each proposition type, respectively for formal and 
informal speech situations (cf. Appendix D). Competent Japanese 
speakers seem to conform to the commonly preferred forms which the 
model presents. The model is, to some extent, supported by the 
evidential-shift performed by the same speakers in different speech 
situations with different formality levels. 

(4) Japanese speakers were found to use more direct forms than 
expected. However, the forms which were abundantly used were 
"semi-direct forms" and "direct question forms". Those forms are sensitive to 
the hearer's knowledge. Basic forms of direct sentence ending (da, 
desu, masu, etc.) were not preferred except for limited situations. This 
may be the reason that Japanese speech is perceived to be very indirect. 

(5) Information which does not belong to either the speaker's or 
the hearer's territory was expressed with direct evidentials more 
frequently than expected, particularly in low-formality speech 
situations. It seems that, in informal settings, the speakers were less 
concerned with the third person's information territory than their 
hearers' information territories. Information which is publicly known 
to be highly trustworthy, in particular, tends to be expressed with direct 
evidentials. On the other hand, a large percentage of the speakers still 
expressed the same kind of information with indirect forms (e.g. 
hearsay, inference), conforming to the boundaries of information 
territory. 

(6) Formal speech situations made the speakers more indirect and 
more sensitive to their hearer's knowledge than informal speech 
situations. However, in formal discourse such as business discussions and 
courtroom speech, which have a significant power difference among 
speakers and a low concern with "affect", formality did not always 
enhance the use of indirect evidentials. 

(7) The concept of the speaker's information territory can be 
relative to different hearers due to its dependency on the concept of 
uchi (inside) (Corollary Four). According to Corollary Two, in Japanese, 
a speaker is entitled to consider other uchi people's information as his 
own territory information (i.e., information with direct evidence). In a 
family atmosphere, speakers were more assertive with direct evidentials 
and even less sensitive to the hearer's (i.e., family members') 
information territory and knowledge. Japanese grammar in general 
has a distinction between uchi (insider) and soto (outsider) in terms of 
reference and addressee. A speaker's territory is considered to include 
all uchi members' information territories. However, one's uchi concept 
can be different in each speech situation with different hearers from 
different social groups ("relativity of the speaker's information territory"). 

(8) Public speech, teacher talk, and courtroom discourse have 
different concepts of information territory in that coverage of the 
speaker's information territory is considered wider than in ordinary 
conversation due to different views on the perceived distance between 
the speaker, the hearer, and the proposition. Usually, in these speech 
situations, the speaker's information territory includes the hearer's 
information. At the same time, teacher discourse is found to be similar 
to family discourse in terms of evidentiality use. 

(9) In terms of evidential usage, female speakers were not as 
unassertive (i.e., not as indirect) as expected when compared with male 
speakers. However, female informants' frequent usage of semi-direct 
evidentials suggests that they may be more sensitive than male speakers 
to the hearer's assumed knowledge. In this sense, female speakers may 
sound less assertive than male speakers. It is also suggested that female 
speakers shifted their preference of evidentials between formal and 
informal speech situations while male speakers tended to consistently 
use the evidentials of the same types in different speech situations. 

(10) Young speakers under fifteen years of age were generally 
found to be direct in expressing all types of proposition. However, the 
concept of information territory was confirmed to be developing in 
seven- and eight-year-old children. At this age, the use of 
addressee-oriented honorifics is observed to develop also, suggesting that the 
concept of information territory of speakers is a part of the language of 
social interaction. 

(11) Speakers utilize a system of standard (i.e., commonly 
preferred) usage of evidentials in order to be assertive or less-assertive 
by not conforming to the common forms (i.e., "evidentiality 
implicature"). Speakers often use evidentiality implicature for the 
purpose of expressing higher politeness. 

Overall, I argued that usage of commonly-preferred forms is, 
more or less, pragmatically required; otherwise an individual may be 
socially indexed. However, since it is not grammaticalized, and the 
concept itself is not clearly known systematically, the usage of 
sentence-ending evidentiality is not explicitly taught to non-native 
learners of Japanese. 


CHAPTER 5: Notes 

1 Kamio's conditions for the information in the speaker's 
territory are shown in chapter three, note 2. I quote them here again: 

(1) Information about direct experience 
(2) Information about personal data 
    (2a) Personal information 
    (2b) Geographical information 
    (2c) Information about plans, actions, and behavior 
(3) Information about expertise 

2 However, sometimes the relative distance between the speaker 
and the information and that between the hearer and the information 
seem to matter on the surface. But this can be explained from a different 
perspective of the speaker's information territory (relativity of the 
territory concept) discussed in a later section. 

3 Group (10) forms (i.e., I think) were used 10% of the time by 
defendants in court discourse for (A) type propositions. This may be due to 
the situational characteristics of court discourse. Except for the court 
cases, Group (4) and Group (10) forms were not used in other discourse 
types for (A) type propositions. 

4 Original Japanese transcription for (5-78): 

F22 (1): tatoeba Joyuu, Joyuu hikoku ga, 
        for example Joyuu, suspect Joyuu NOM 

        hikoku de ii-n-desu ne. 
        suspect right-n-COP(FOR) (CONF) 

(2): kono aida hikoku ga saiban de arimashita yo ne .. 
        the other day suspect NOM trial LOC existed (VOC)(CONF) 

(3): are go-ranninatte ikaga deshita ka? 
        that HON-watch how COP(PAST)(FOR) Q 

(4): goran-ni-narimashita ka? 
        watch(HON)(PAST) Q 

M11 (5): hai, maa, taiho-sareru mae wa nankaika yariat-tari shuzai 
        yes well arrest-(PASS) before CONT a few times debated etc data-collection 

        de hanashi o shiteta... 
        by talked 

F22 (6): intabyuu mo zuibun nasai-mashita yo ne# 
        interview also a lot do (HON)(PAST) PART(VOC)(SHAR) 

M11 (7): ee, hai. de maa hatsukoohan no toki kare o mite. 
        yes, yes, then well first trial MODI time him OBJ watched(te) 

(8): kare ga kekkyoku tsumi o mitomenai de sono 
        he NOM eventually guilt OBJ admit-NEG(te) 

        asahara ni tsuiteiku to iu shushi no shuukyootenina hatsugen o 
        Asahara follow QUOT content MODI religious statement OBJ 

        shita-n-desu ga ne.. 
        said-n-COP(FOR) but PART(RAPP) 

(9): nannka ichiban tsumaranai Joyuu o mita naa 
        somewhat most boring Joyuu OBJ watched PART(VOC) 

        konnna otoko to yoriatteka no ka-tte iu shitsuboo deshita ne.. 
        such man with argued COMP Q QUOT disappointment COP(FOR)(PAST) VOC(RAPP) 

F22 (10): ano hito wa nan na-n-deshoo . . hontooni soo 
        that person TOP what kind of-n-CONJ. really so 

        omotteru-n-deshoo ka . 
        think(STAT)-n-CONJ Q 

M11 (11): ano kare wa chotto tokubetsuna no wa 
        well he TOP a little special NML TOP 

        ichirenno hokano hikoku to chigatte 
        a series of other suspect different(te) 

        tatoeba chikatetsu sarin ni kanyoshita to ka 
        for example subway Sarin with related etc. 

        rinchi satsujin jikenn ni kannyoshita to ka 
        lynch murder cases with related etc. 

        soo iu tsumi de tsukamatteru wake janakute 
        such crime by arrested(STAT) NEG 

        furui aru rokunen mae, kumamoto de 
        old one 6 years ago, Kumamoto LOC 

        okita saibann no gishoo o yatta to iu 
        happened trial MODI false testimony OBJ did QUOT 

        koto de tsukamatteru kara aru imi de oomu no 
        COMP arrested(STAT) because a certain sense by Aum POSS 

        ichirenno jiken no honnkenn to wa bekken de 
        series of cases MODI core case CONT different case by 

        tsukamatteru wake desu ne.. 
        arrested(STAT) COP(FOR) PART(RAPP) 

(12): omoshiroi no wa ichirenno oomu no kanbu-tachi no 
        interesting thing TOP top AUM POSS executives POSS 

        hanashi o shuzaisiteru naka de jitsu wa Joyuu ga 
        story OBJ cover(PROG) within in fact Joyuu NOM 

        ichiban omoshirokatta-n-desu yo. 
        most interesting(PAST)-n-COP(FOR) PART(VOC) 

        shuzai de wa ne.. 
        covering story CONT PART(RAPP) 


(13): ano tatoeba sashi-koros-areta Murai Hideo to iu 
        well for example, stabbed-killed-(PASS) Murai Hideo-QUOT 

        hanashi o shita-n-desu kedo amari kyootsuuno 
        talked -n-COP(FOR) not much common 

        kairo to-iu mono ga nai-n-desu yo ne . . 
        circuit-QUOT thing NOM NEG-n-COP(FOR) PART(VOC)(RAPP) 

(14): maa bengoshi no Aoyama to-iu bengoshi no 
        well attorney MODI Aoyama QUOT attorney MODI 

        hikoku ni mo nai-n-desu ga, 
        suspect with also NEG-n-COP(FOR) but 

(15): shikashi Joyuu dake wa wareware to futsuu-no 
        but Joyuu only TOP we with ordinary 

        kaiwa ga dekiru otoko datta-n-desu yo. 
        conversation NOM capable man was-n-COP(FOR) PART(VOC) 

(16): sono hen ga hijooni shuzai-joo 
        that parts NOM very much covering story-upon 

        wa omoshirokatta otoko desu yo. 
        CONT interesting(PAST) man COP(FOR) PART(VOC) 

F22 (17): sono hen no umisen-yamasen no masukomi o 
        that areas MODI cunning mass-communication OBJ 

        aite ni shite ne., mada sanjuu soko soko no 
        opponent DAT had RAPP only 30 years old barely MODI 

        hito ga aa yatte mainichi kishakaiken o shite 
        person NOM that way did everyday press interview OBJ did(te) 

        tonikaku kemuni-maku ni shite mo nan ni shite mo 
        anyway fooled doing whatever doing 

        iikurumeru yoona koto o yareru-tte koto jitai ga ne . 
        confuse like act OBJ do(POT)-COMP itself OBJ (RAPP) 

        ikura sore wa benron taikai de ichii datta-tte 
        even though that TOP debate contest LOC best was-but 

        nakanaka futsuu wa sonna ni ikanai desu kara. 
        hardly generally CONT that much go(NEG) COP(FOR) because 

M11 (18): are wa hontoni tokubetsu no sainoo o motteru 
        that TOP really special talent OBJ have(STAT) 

        otoko datta naa-tte, 
        man was VOC-QUOTE 

        hijoo ni atama no kirikae ga dekiru otoko to... 
        at first thinking switching NOM capable man COMP 

        soreni hijooni kuuruna otoko deshita ne . 
        in addition very much cool man COP(PAST)(FOR) (RAPP) 

M11 (19): de koo-iu hanashi o, 90-nenn ni oomu ga senkyo 
        then this-QUOTE story OBJ 1990 TEMP Aum NOM election 

        ni utte-deta. 
        to advanced 

(20): sono toki ni Joyuu-hikoku ga Asahara ni senkyo ni 
        that time TEMP Joyuu suspect OBJ Asahara to election to 

        hantai-shita to iu-koto ga atta-n-desu ne .. 
        opposed QUOT COMP NOM was-n-COP(FOR) PART(RAPP) 

(21): sorede tooji soo-iu-hanashi ga atta node 
        therefore at that time so-QUOT-episode NOM existed because 

        shuzai no ato zatsudanshitete soo-ieba 
        coverage MODI later chatting(te) 

        senkyo ni hantaishita-n-datte-tte iu-fuuni 
        election to opposed-n-heard -QUOTE-fashion 

        kiitara atarimae desho daremo hitori mo 
        asked(COND) of course tag-Q nobody one person even 

        toorimasen yo-tte koo iu wake desu. 
        elected(NEG) VOC-QUOT COMP say COP(FOR) 

(22): de sore wa wareware mo toora-nai to omou kedo 
        then that TOP we also pass(NEG) COMP think but 

        aa-iu kyooso o chushin to shita piramiddo 
        such leader OBJ center make pyramid 

        soshikida-to dooshitemo ano fukujuu-shite 
        organization CONDI by all means well obey 

        kare no zettaitekina kachikan ni shitagau 
        his POSS absolute value follow 

(23): minna soo yatteta-n-desu kedo 
        everyone so did-n-COP(FOR) 

        Joyuu dake wa sameteta-n-desu ne . 
        Joyuu only CONT apathetic-n-COP(FOR) PART(RAPP) 

(24): daremo toorikko-nai-tte sokuzani itta 
        nobody pass-NEG-QUOT immediately said 

        tokoro ga koo hijoo ni bokura wa oya-tto 
        action NOM well very much we TOP "what"-QUOTE 

        odorokimashita ne . 
        surprised(FOR) PART(RAPP) 

F22 (26): sono hoka ni mo sou-iu tokoro takusan 
        that other also such action many 

        arimashita kedo. 
        existed(FOR) but. 

(27): dakara saiban ni natte ichiban odorokimashita 
        therefore, trial became most surprised(FOR) 

        yo ne .. 
        PART(VOC)(RAPP) 

(28): aaiu taido o totteru koto ga ne. 
        such attitude OBJ take(STAT)COMP NOM PART(RAPP) 

5 Original Japanese transcription of (5-79): 

M16 (1): de dounika shite sukoshi demo wakariyasuku 
        then by some way a little even easy to understand 

        deki-nai-ka naa-tte omotta toki ni kore o 
        make-NEG-Q VOC-COMP thought when TEMP this OBJ 

        koo yatte kami o fuseta toki ni ongaku mo 
        this do(te) paper OBJ put down time TEMP music also 

        tomeru-n-desu ne . 
        stop-n- COP(FOR) PART(RAPP) 

(2): ongaku mo dakara yamete kudasai-tte 
        music also therefore stop(te) please-QUOT 

        koo fuseteru-tte koto o tometa-n desu. 
        like this put-down-QUOTE COMP OBJ stop-n-COP(FOR) 

(3): soosuru-to desu ne., sukunakutomo kami o 
        doing so-COND COP(FOR) RAPP at least paper OBJ 

        oita-tte iu stoorii no kaname no pointo ga 
        put-QUOT story POSS key-point NOM 

        desu ne., tsutawaru wake desu ne . . 
        COP(FOR) RAPP conveyed COP(FOR) PART(RAPP) 

(4): desukara oto o tsukau-to sugoku wakariyasuku 
        therefore sound OBJ use-COND very easy to understand 

        narutte koto na-n-desu kedo ne . . 
        become COMP -n-COP(FOR) PART(RAPP) 

F23 (5): demo biiru no ton o okareru oto mo 
        but beer POSS ONOM QUOT put down(PASS) sound also 

        nannka pa-tte kawaru-n-desu yo ne . . 
        something ONOM change-n-COP(FOR) PART(VOC)(RAPP) 

F16 (6): desukara kouyatte don to naru-to sokode stoorii 
        therefore doing this ONOM sound-CONDI there story 

        ga shinkoo-site-iku wake desu yo ne . 
        NOM proceed COP(FOR) PART(VOC)(RAPP) 

(7): desukara asokode oto ga tomara-nakat-tara 
        therefore then sound NOM stop-NEG-COND 

        ochanoma no hito no shisen wa Hagiwara-san 
        living room MODI person POSS viewpoint TOP Hagiwara(HON) 

        no kao shika ika-nai wake desu yo. 
        POSS face only go-NEG COP(FOR) PART(VOC) 

(8): sore wa tarento-san desu kara soko shika 
        that TOP talent-HON COP(FOR) because there only 

        ika nai wake desu yo. 
        go-NEG COP(FOR) PART(VOC) 

(9): soko ga ichiban omoshiroi, sono naka no 
        that NOM most interesting that within 

        gamen no naka de wa ichiban omoshiroi 
        screen POSS inside LOC CONT most interesting 

        tokoro dakara. 
        part because 

(10): oto de muishikini eeto soo-iu tokoro o 
        sound INST unconsciously well such factor OBJ 

        nanka mite morau, soshite stoorii de 
        somewhat watch-receive then story 

        15 byoo shika nai keredo soo-iu koto o 
        15 seconds only NEG but such COMP OBJ 

        kushi-shite stoorii o wakatte moraerutte 
        utilize (te) story OBJ understand-receive(te) 

        koto o yatteru-n-desu ne . . 
        COMP OBJ doing-n-COP(FOR) PART(RAPP) 

        ... 

(11): muishiki no uchi ni, ochanoma wa moo 
        unconsciously living room TOP very 

        terebi no puro desu kara ne . 
        TV MODI professional COP(FOR) because PART(RAPP) 

(12): maa ichinini ni san yo jikan miru no ga 
        well one day three four hours watch COMP NOM 

        heikin de arimasu yo ne . 
        average exist PART(VOC)(RAPP) 

(13): sore o moo nanjuu-nen mo tuzuketeru wake 
        that OBJ already decades as much as continue(STAT) 

        desu kara moo dareyorimo ochanoma 
        COP(FOR) because more than anybody living room 

        wa terebi no puro na-n-desu yo 
        TOP TV MODI professional-n-COP(FOR) PART(VOC) 

6 One of the most important sentence-medial evidentials in relation to 
speaker's psychological territory of information is the use of deixis, 
which is introduced in chapter three, note 4. 


CHAPTER 6: JAPANESE LINGUISTIC POLITENESS AND EVIDENTIALITY 

So far it has been demonstrated that the situationally appropriate 
use of evidentiality marking is not grammatically obligatory but rather 
is a pragmatic requirement for competent speakers of the language. 
This issue is closely related to linguistic politeness in Japanese. In this 
chapter, I will examine how the system of Japanese evidentiality coding 
is positioned in politeness theory. 

In general, across most languages, linguistic evidentiality 
markings are primarily based on the speaker's source of information, 
i.e., the speaker's direct or indirect experience. I have demonstrated in 
this study that Japanese evidentiality marking is sensitive to both (1) 
the "owner of information" and (2) the "assumed hearer's knowledge" 
about the proposition. It is a minimum requirement, in interpersonal 
communication, for the speaker to demonstrate sensitivity to these 
factors with appropriate sentence-ending evidential forms. 

A speaker may use two techniques to be more polite than required 
through the use of evidentiality implicature: The speaker minimizes his 
own information territory, or conversely expands his hearer's 
information territory. A speaker may thus make a certain piece of 
information appear to be known more by the hearer or shared between 
himself and the hearer. 

Two integral politeness factors resulting from this concept of 
Japanese evidentiality coding are "demonstrating the speaker's indirect 
relationship to the proposition" and "demonstrating the shared nature 
of the proposition". As a matter of fact, these two factors have been 
considered to be important politeness rules and strategies in classic 
studies of linguistic politeness. First, regarding "indirectness", Held 
(1992), for example, commented that "...the broad scope of polite 
behavior has also undergone a certain reduction to rational, goal-
directed behavior strategies in which the component of respect is 
almost exclusively anchored in indirectness" (p. 131). Speech-act 
theory, which introduced the linguistic aspect of politeness into the 
framework of pragmatics, is primarily based on the concept of 
indirectness (e.g. Searle 1975; Lakoff 1973a, b; Leech 1983).1 Politeness 
strategies to help other people save face through a low degree of 
imposition also enhance indirect behavior (see Brown and Levinson, 
1987:60). As Held claims, the traditional concepts of "respect" and "tact" 
have been recognized and analyzed by both Grice (1967, first published 
1975) and Searle (1975) as "theories of indirectness" in the beginning, 
and have been further re-shaped by Lakoff, Leech and also by Brown 
and Levinson (1978, 1987). 

Second, the linguistic behavior of "information sharing", in 
which a speaker demonstrates an expectation that the listener has 
extensive knowledge, is relevant to Lakoff's politeness rule of "show 
camaraderie" and also corresponds to a portion of Brown and Levinson's 
"positive face" strategies (e.g. "presuppose common ground", "assume 
reciprocity"). 

In the following sections, first, recent studies of linguistic 
politeness are reviewed and discussed; secondly, I propose my view of 
politeness based on the perspective of this research; and lastly, I 
examine how Japanese evidentiality marking fits in the politeness 
system. 

LITERATURE ON LINGUISTIC POLITENESS 

Since the 1970's, in particular, a range of thoughts have been 
expressed in this field of study. Although they differ one from another 
in details, most of them seem to fall into one of several main
approaches.2 For convenience, I would like to review the principal
theories from two different viewpoints. The first approach can be 
called the "normative view" in which politeness is considered to be 
conformance to rules such as universal "pragmatic rules" or culturally 
or historically defined "social orders". The second approach, the 
"strategic view", treats politeness as a set of strategies to realize 
conversational goals. Some researchers apply a combination of the two 
approaches to politeness: both rule based and strategy based. 

Politeness as normative rules 

Before politeness was discussed within the framework of 
pragmatics, pre-pragmatic linguists paid attention to the normative 
aspects of politeness. Generally, politeness was considered to be a
common-sense concept. Lakoff was the first scholar to advocate that
politeness can be conceived as a pragmatic rule. Earlier, in his
influential theory of the Cooperative Principle (1967, 1975), Grice had
argued that rational people unconsciously follow four mutually 
understood principles in conversation: (1) the speaker should be as 
informative as necessary; (2) he should be precise; (3) he should be 
truthful; and (4) he should be relevant within the context of the 
conversation. Grice also associated these four major principles with a 
set of more specific maxims and sub-maxims. Grice assumed that 
"anyone who cares about the goals that are central to 
conversation/communication (e.g. giving and receiving information, 
influencing and being influenced by others) must be expected to have 
an interest, given suitable circumstance, in participation in speech 
exchanges that will be profitable only on the assumption that they are 
conducted in general accordance with the Cooperative Principles and 
Maxims" (1975: 49). Grice suggested that violation of any of the 
conversational maxims is a message to the listener that the speaker's 
utterance is to be interpreted in a manner other than its literal 
meaning. 

Based on Grice, Lakoff (1973a) argued that a person's choice of 
words and sentences reflects more than just literal semantic and 
syntactic meaning. She said that there are pragmatic rules which 
govern language use, and that people violate Grice's maxims for the 
pragmatic purpose of being polite. Lakoff proposed two rules of
"pragmatic competence": "be clear" (based on Grice's maxim) and "be 
polite". She claimed that being polite is often more important than 
being clear in conversation if the speaker wants to foster a good 
relationship with his listener: The goal of most conversations is not
necessarily the exchange of information in the most clear and efficient
manner possible, but rather it is often to strengthen relationships
between participants. Lakoff delineated three rules of politeness. They
are: (1) don't impose on other people's business (formal/impersonal
politeness); (2) give options to the listener (informal politeness); and
(3) make the listener feel good by telling him what he wants to hear
(intimate politeness). The first and the second rules give the listener
autonomy by allowing him to decide whether to go along with the speaker's
conversational attempts and goals. The third rule primarily aims to
make the listener "feel good" by using warm fuzzies such as praise and
compliments, or conveying a sense of equality or camaraderie.

Leech (1983) also elaborated on Grice's theory by introducing a
set of rhetorical principles and maxims that constrain rational speech 
behavior. Leech argued that a speaker always has social goals, and that 
in pursuing these goals he should avoid any verbal or nonverbal 
conflict. Politeness, in order to maintain harmonious interaction, is one 
of his "Interpersonal Rhetorical Principles". Interpersonal Rhetoric 
has maxims that fall in three different domains: (1) the Cooperative 
Principle, (2) the Politeness Principle, and (3) the Irony Principle. 
Leech proposed six major "Politeness Maxims", which Fraser (1990: 225)
organized as follows:

(a) Tact Maxim: Minimize hearer cost; maximize hearer benefit.
(a') Meta-Maxim: Do not put the hearer in a position where either
the speaker or the hearer has to break the Tact Maxim.
(b) Generosity Maxim: Minimize your own benefit; maximize your
hearer's benefit.
(c) Approbation Maxim: Minimize hearer dispraise; maximize
hearer praise.
(d) Modesty Maxim: Minimize self-praise; maximize self-dispraise.
(e) Agreement Maxim: Minimize disagreement between yourself
and others; maximize agreement between yourself and others.
(f) Sympathy Maxim: Minimize antipathy between yourself and
others; maximize sympathy between yourself and others.

Leech further proposed five different "scales" to measure the 
degree of conformance to the maxims: the Cost-Benefit Scale, the 
Optionality Scale, the Indirectness Scale, the Authority Scale, and the 
Social Distance Scale. So, theoretically, these six maxims and five scales 
should be sufficient to "diagnose" human politeness behaviors. Since 
the word "politeness" was too "generic" for Leech, he identified four
different types of politeness: Competitive Politeness, Convivial
Politeness, Collaborative Politeness, and Conflictive Politeness. He did
not, however, discuss a speaker's motivation for choosing one type of 
politeness over another. Although scholars tend to consider Leech's 
overall approach to be too theoretical to apply to actual language use
(e.g. Fraser, 1990; Watts et al., 1992), Leech did indisputably provide us
with a detailed elaboration of Grice's concept.

Interestingly, Leech implied that the goal of politeness is to 
establish and maintain social rules ("comity"). In this sense, his
approach based on communication maxims leads to a "social norm/order
view" of politeness that also sees politeness as normative behavior in a
culture-oriented way. A group of researchers of non-Western
languages (e.g. Hills et al., 1986; Matsumoto, 1989; Ide, 1989; Koo, 1995)
see politeness as a set of standard behaviors in a given society to which
each individual is obliged to conform. In this "social norm" view,
politeness is a social rule, part of the common ground of community
members. This view will be discussed after the strategic view. 

Politeness as a strategy 

While Lakoff and Leech viewed politeness as a set of regulative
principles which govern our linguistic behavior, other scholars argued
that speakers use politeness behavior as an interactional strategy in an
attempt to attain their conversational goals. It is not too radical to
assume that all human interaction is strategic to some extent in that it
usually has goals to attain (e.g. Read et al., 1989; Pervin 1989). Politeness
behavior is not an exception.3 Looking at the definitions of politeness
may help to clarify this idea. There have been a variety of definitions of
politeness, although there is not a single universally accepted one.
Many researchers claim that the purpose of politeness is to make the
hearer feel good, to make the conversation harmonious and human 
relations peaceful (e.g. Lakoff, 1973; Held, 1992). However, under these
"surface" purposes there may exist the speaker's intended goals which can
be achieved only on peaceful terms. Watts et al. (1992) quoted a
definition of politeness from "1702 The English Theophrastus: or the 
manners of the age", and paraphrased it as follows: 

Politeness is a form of social behavior encompassing both 
linguistic and non-linguistic activity; that it is a skill which, if 
acquired, is to be used in a rational, premeditated fashion to 
achieve very specific aims; that its principal aim is the 
enhancement of ego's self-esteem and his public status esteem; 
that it demands a subtle interpretation of the social context in 
which it is to be used (45). 

Watts compared this definition with the modern definitions by
Lakoff (1975:64), Leech (1983:104), Fraser and Nolen (1981), and Brown
and Levinson (1987:1)4, and commented that the modern definitions of
politeness from the "maxim/rule" viewpoint (e.g. Lakoff and Leech)
interestingly do not differ significantly from the eighteenth-century
idea of politeness. He said that those definitions lack the basically
egocentric nature of politeness behavior, and concluded that whereas
on the surface politeness may appear to fulfill altruistic goals, a 
communicative partner may be potentially aggressive as Brown and 
Levinson posit; thus politeness is, nevertheless and to some extent, a 
mask to conceal the ego's true frame of mind. 

P. Brown and Levinson (1978, 1987) proposed a theory of 
politeness that viewed politeness as a set of goal-oriented and situation-
dependent behavioral strategies. They referred to Grice's Cooperative
Principles in that a deviation from the Principles (i.e., conversational 
implicatures) demonstrates the speaker's intention to be polite (also see 
Fraser, 1990). Their theory was based on several crucial assumptions. 
First, they assumed that politeness behavior is universal, although 
specific methods for expressing politeness may differ from one culture 
to another. Second, they also assumed that humans rationally use 
strategies which help them attain their goals, and that one of their 
goals is to mutually maintain "face" (individual's self-esteem) in 
communicative interactions. This concept of face maintenance is 
central to their framework. They adapted the notion of face from 
Goffman (1967) to whom they dedicated their 1987 book. In his book
"Interaction Ritual", Goffman proposed that face is the ideal social
image which an individual wants to portray of himself. Goffman wrote
that "face may be defined as the positive social value a person
effectively claims for himself by the line others assume he has taken
during a particular contact" (p. 5). He also claimed that a speaker must
be considerate so as to maintain not only his face but also his interactants'
face, and that the mutual maintenance of face is a basic feature of any 
social encounter. Brown and Levinson adopted this concept of face in 
human interaction, and assumed that there are two dimensions of 
human "face wants". They called these "negative face" and "positive 
face". Negative face is everybody's desire to be free from imposition by 
others, to have their personal prerogatives, and to maintain respect
for their territory. Positive face represents one's desire to be approved
of, to be appreciated, and looked upon favorably by others. Brown and 
Levinson concurred with Goffman in that rational individuals will try 
to maintain each other's negative and positive face unless there is some 
other goal which is more important than fulfilling each other's face-
wants. Furthermore, speakers must be most aware of their interactant's 
face-wants when pursuing a potentially threatening goal. Brown and 
Levinson described these "intrinsic face threatening acts" (or FTA's) as
introducing the potential for interpersonal conflict. Therefore, 
according to the theory, the speaker will either try to minimize the 
damage which an FTA may cause, or decide not to do the FTA at all, or if 
he is not concerned with causing a conflict, he will boldly exercise 
FTA's to his hearer's face. Brown and Levinson considered FTA to be an 
important determinant of one's use of politeness. Actually, all forms of 
politeness are linked to FTA's in Brown and Levinson's theory. As the 
estimation of risk of face loss by an FTA becomes greater, a speaker will 
need to resort to higher levels of politeness strategies. The following 
chart [6-1] represents the core concept of the theory. The chart shows 
that if a speaker estimates that a minimal loss of hearer's face will be 
caused by his FTA, the speaker may perform an FTA without redressing 
it. This type of speech is Gricean maximally efficient speech: telling the
truth straightforwardly in an unambiguous way with the least 
necessary amount of information. That is strategy (a) in [6-1]. As the 
estimated face loss increases, the speaker may need to resort to a higher
degree of politeness. The strategies are: (b) redress or make up for his
threatening action by positive politeness strategies to satisfy the
hearer's "positive face wants" (e.g. compliments, in-group references,
familiar address, and sympathy); (c) redress his action, which
may be threatening to the hearer's "negative face wants", by using
negative politeness strategies (e.g. minimizing the size of the imposition,
guaranteeing its nonrecurrence, and making indirect requests); (d) do an
FTA but in a circumlocutory or ambiguous way so that the hearer may
interpret it not as face threatening but as inefficient communication;
and (e) do not say
anything which is potentially face threatening when the greatest face
loss is estimated, but since most communications are in some way face
threatening (e.g. R. Brown, 1990) this strategy may result in a lack of
communication.

[6-1] P. Brown and Levinson's model of politeness strategies (1987:60)

Lesser face loss is estimated

    Do the FTA
        on record
            (a) without redress, boldly
            with redressive action
                (b) positive politeness strategies
                (c) negative politeness strategies
        (d) off record (do FTA but ambiguously)
    (e) Don't do the FTA (but no communication)

Greater face loss is estimated

Commenting on P. Brown and Levinson's framework, R. Brown 
and Gilman (1989) suggested that negative and positive politeness are 
not independent of each other as P. Brown and Levinson posit. They 
said that whether a strategy is positive or negative is not the nature of 
the strategy itself but rather depends on how the strategy is used in a 
particular situation. Accordingly, they collapsed these two categories,
(b) and (c) in [6-1], into one. Baxter (1984) also found from her
empirical studies that negative politeness strategies scored high only
for the negative politeness ratings whereas positive politeness
strategies scored high on both positive and negative politeness
dimensions. Baxter, therefore, speculated that positive politeness may
be a higher politeness strategy since it subsumed both positive and 
negative politeness strategies. These studies threw some doubt on the 
boundary in reality between the negative and positive strategies of 
Brown and Levinson's framework. As a matter of fact, in their 1987 
book in which Brown and Levinson reassessed some aspects of their 
original model, they acknowledged that the actual boundaries between 
the strategies are far less clear than their original model had implied: 
The three super-strategies, positive politeness, negative politeness, and 
off-record, which were ranked unidimensionally to achieve mutual 
exclusivity, may be used inclusively in real utterances (pp. 17-18). 

In order to determine what politeness strategy should be used in a 
given situation in accordance with the level of FTA, P. Brown and 
Levinson identified three situational factors: the horizontal (or social) 
and vertical (hierarchical) distance between interactants, and the 
speaker's assessment of the probable imposition degree that a certain 
FTA would create between the participants in a particular speech 
setting. Based on these ideas, Brown and Levinson originated the
following formula to calculate the weightiness (W_x) of an FTA x:

[6-2] W_x = D(S, H) + P(H, S) + R_x

In the formula, D(S, H) is the social Distance between the
Speaker and the Hearer, P(H, S) is the relative Power the Hearer has
over the Speaker, and R_x is the absolute Ranking of imposition of the
intended action x (e.g. requesting, complaining, promising, and
apologizing) in a given culture. Brown and Levinson assumed that
Distance and Power factors are universal determinants of politeness
strategy and that there would be a cultural difference in the evaluation
of the R factor. This formula suggests that the seriousness of an FTA and the
consequent need for an appropriate politeness level are calculable based
on a linear combination of three contextual factors: the social distance
between the actors, the hearer's power, and the degree of imposition.

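The additive character of the weightiness formula can be illustrated with a minimal sketch in Python. The 1-to-5 integer scales, the names Situation, weightiness and choose_strategy, and the cut-off values are all hypothetical and introduced here only for illustration; they are not taken from Brown and Levinson. The sketch assumes only what the text states: that W is a linear sum of D, P and R, and that a greater estimated face loss calls for a strategy further down the list (a) to (e).

```python
from dataclasses import dataclass

# Strategy labels follow the five options (a)-(e) described in the text,
# ordered from least to most redress.
STRATEGIES = [
    "(a) on record, without redress",
    "(b) positive politeness",
    "(c) negative politeness",
    "(d) off record",
    "(e) don't do the FTA",
]


@dataclass
class Situation:
    distance: int  # D(S,H): social distance, hypothetical scale 1-5
    power: int     # P(H,S): hearer's power over the speaker, hypothetical scale 1-5
    ranking: int   # R_x: ranking of imposition of act x, hypothetical scale 1-5


def weightiness(s: Situation) -> int:
    """Return W_x = D(S,H) + P(H,S) + R_x as a plain linear sum."""
    return s.distance + s.power + s.ranking


def choose_strategy(w: int) -> str:
    """Map W_x to a strategy label; the cut-offs are invented for illustration."""
    # On the assumed scales W_x ranges from 3 to 15; split it into five bands.
    cutoffs = [5, 8, 11, 13]
    for label, cutoff in zip(STRATEGIES, cutoffs):
        if w <= cutoff:
            return label
    return STRATEGIES[-1]
```

Under these assumed scales, a small request to a close equal, Situation(1, 1, 1), has a weightiness of 3 and falls under strategy (a), whereas a heavy request to a distant superior, Situation(5, 5, 5), has a weightiness of 15 and falls under strategy (e). Because the sum is linear, an increase in any one factor can be offset by a decrease in another, which is the property that the follow-up studies discussed below call into question for the distance factor.
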
The strength of Brown and Levinson's framework is its 
interactional context dependency which is advantageous when
compared with the rule/maxim views presented by scholars such as
Leech and Lakoff. As a matter of fact, in general opinion, Brown and
Levinson's theory excels in representing politeness as a system in broad
outline based on the natural human desire of face-wants as well as
potentially goal-oriented human behavior. As a whole it is a grand
intellectual work (see also R. Brown, 1989) that has been empirically
supported by fellow researchers (e.g. Ervin-Tripp, 1976; Baxter,
1984; Blum-Kulka, et al., 1985; Holtgraves, 1986; R. Brown and Gilman, 
1989; Holtgraves et al., 1990). 

At the same time, follow-up studies by various researchers have 
yielded some findings which partially dissent from Brown and 
Levinson's theory. Ervin-Tripp (1976) found that requesting behaviors
between intimates are more direct than those between strangers when
"D" (social horizontal distance) is defined as solidarity. Interestingly,
Wolfson (1988) found that her subjects were more polite to
acquaintances than to strangers or intimate friends; therefore, she
reported that there is a "bulge" between the two extremes of social
distance: politeness peaks for those who are "neither close nor distant"
rather than for the "very close and very distant". R. Brown
and Gilman (1990) reported that when "D" is defined as "affect" (liking),
the higher affect resulted in higher politeness. They proposed two
components of "D", "interactive closeness" and "affect", and claimed
that, in four Shakespearean tragedies, those two factors are not closely
associated with each other: Interactive closeness has little to do with
politeness. This finding suggests that the "D" factor that actually
influences the choice of politeness strategy is "liking". Field (1991) and
Koo (1995) proposed that the D (distance) factor should be broken into 
three independent factors: "affect" (liking), "familiarity", and 
"familiarity-by-affect interaction". These reports suggested that a reexamination of the distance factor might be necessary. 

Along with these observations, the theory received some serious 
disagreement from scholars regarding its basic postulates. There seem 
to be two major criticisms. The first criticism is that the concept of 
face-wants is not universal. Since the concept is central to Brown and 
Levinson's framework, the critics contend that the universality of the 
theory is questionable (e.g. Blum-Kulka, 1990; Gu, 1990; Matsumoto 1989). 
The second argument is based on the concept of "deference". On the 
face concept, for example, a Japanese researcher, Matsumoto (1988, 
1989) claims that Japanese people's concept of face is different from 
that of Brown and Levinson's model because of group-orientation in 
Japanese society. Matsumoto's assertion that Japanese people have no 
concept of territory was introduced in chapter three of this dissertation. 
She argues that Japanese group-orientation leads to the lack of both
the concepts of "face" and "territory".

What is of paramount concern to a Japanese is not his/her own 
territory, but the position in relation to the others in the group 
and his/her acceptance by those others. Loss of face is associated 
with the perception by others that one has not comprehended 
and acknowledged the structure of the group. 

What is most alien to Japanese culture in the notion of face, as 
attributed to the model person, is the concept of negative face 
wants as the desire to be unimpeded in one's action. Postulating as
one of the two aspects of the Model Person's 'face', the desire to be
unimpeded, presupposes that the basic unit of society is the
individual. With such an assumption, however, it is almost
impossible to understand behavior in the Japanese culture.
(1988: 405)

Matsumoto argues that in societies with very strong "group
interdependency" such as Japanese society, members do not exactly
have negative face wants. I argued in chapter three that Japanese 
group-orientation does not indicate an absence of the concept of 
territory. Again, I also believe that Japanese speakers' interdependency
does not mean they lack the concept of both types of face. As a
matter of fact, as Matsumoto argues, in such a society, each individual 
may be concerned with being acknowledged by or depended on by other 
members of the group (i.e., Brown and Levinson's positive face wants) 
so that, within a social group, members may present a strong 
interdependency encouraged by intra-group positive politeness 
behavior. At the same time, as noted earlier, Japanese people typically 
do not want to be impeded by other group members (e.g. Nakane, 1967; 
Doi, 1973). This is a form of negative face wants where an individual 
represents his group's face. No matter how diverse international 
cultures are, it is likely that all human beings have both basic positive 
and negative face wants. 

At the same time, however, the Japanese case may present 
different boundaries between negative and positive strategies from 
Brown and Levinson's original model. For example, the imposition from 
a lower status speaker to a higher status addressee is sometimes
conventionally regarded as positive politeness. An example is a mother
saying to her daughter's teacher, "musume o yoroshiku onegai
shimasu" (lit. "Please take good care of my daughter"). If we regard this
behavior as attending to the hearer's wants to be depended upon, this is
an authentic positive face strategy. In another sense, the speech could be
considered an "imposition", which should be avoided under negative
politeness in Brown and Levinson's model. However, the nuance of
imposition is mitigated with the word "please". Also, at a deeper level,
the sentence acknowledges the trust that the speaker places in the
hearer. Such an entrusting act can be seen as a "compliment" to the
hearer. In this way, the interpretation of the model must depend on
each speech community; however, Brown and Levinson's model may be
found highly relevant in any speech community.

Regarding the concept of FTA, in Brown and Levinson's 
framework, as written earlier, every exercise of politeness strategy is 
linked with an occurrence of FTA and the corresponding needs to 
strategically redress FTA for the sake of the actor's intended goal. This 
leads to the second major criticism. It was claimed that deferential 
politeness, i.e., "ordinary everyday politeness" (Koo, 1995), seems to be 
realized without FTA and that politeness is not only strategic but also 
part of socially required normal standard behavior. This type of view 
generally considers politeness to have two dimensions: politeness as a 
social rule and politeness as situational strategy. For convenience, I 
will temporarily call this type of view the "two-dimensional politeness"
view. 

Two-dimensional-politeness view 

The two-dimensional view arose in response to the appearance of 
Brown and Levinson's theory. Non-Western researchers (e.g. Hills et 
al., 1986; Matsumoto, 1989; Ide, 1989; and Koo, 1995) argue that languages 
with honorifics such as Japanese, Javanese, and Korean have a 
different dimension of politeness: deferential type politeness. Ide 
(1989), for example, criticized all major existing pioneering politeness 
theories, by saying that all the theories "could not avoid an 
ethnocentric bias toward Western languages and the Western 
perspective" (p. 224) and that they are not adequate for languages with 
honorifics such as Japanese (Brown and Levinson, however, claimed 
their model could handle honorifics. See later section). Those Eastern 
researchers claim that speakers of languages with honorifics express 
politeness through two different channels. Ide (1989) described the two 
channels as "intentional strategies to allow his or her [i.e., speaker's] 
message to be received favorably by the addressee" and the "speaker's 
choice of expressions to conform to the expected and/or prescribed 
norms of speech appropriate to the contextual situation in individual 
speech communities" (p. 225). The first channel seems to be relevant to 
Brown and Levinson's framework if we interpret Ide's "to make message
be received favorably" as meeting a hearer's face-wants. This type of
politeness is speaker-intentional, and is thus called "volitional 
politeness". Ide claimed that the second type of politeness is neglected 
in Brown and Levinson's framework. The second type of politeness is 
described as a socially required standard behavior to be a competent 
member of the society; therefore, if a speaker does not meet this social 
demand, some kind of social sanction may be applied. This politeness is 
called "discernment politeness" (e.g. Hills et al., 1986; Ide, 1989).
Discernment (wakimae in Japanese) refers to the use of a standard in
formal settings (Watts et al. 1990). Wakimae may also refer to a set of
programmed behavioral patterns recognized as appropriate by
community members in each social setting.5 This view of politeness
involving socially required discernment is called the "social-norm" 
view. 

The social-norm view emphasizes the importance of deferential 
politeness which had been routinely separated from the scope of 
politeness for some reason. Hwang (1975, 1990) proposed to distinguish 
deference from politeness, since in the Korean language they are two 
distinctively different sociolinguistic concepts. This brings up the 
question of how the scope of politeness should be defined in a universal 
way. Hwang said that we may identify politeness with "sentiments" (a 
psychological matter) and deference with "conventional norms" (a 
matter of social code). Koo (1990) claimed that Korean honorific use is 
not a strategic politeness, which is to mitigate predictable effects of FTA,
but simply an expression of the speaker's discernment which occurs 
even in non-FTA situations. Matsumoto (1987) also claimed earlier from 
her experiment with Japanese speakers that even in a "non-FTA" 
interaction, the obligatory use of honorific expressions was confirmed 
in hypothetically high formality settings. However, the basic question 
which came to my mind here is how large is the set of 'one hundred 
percent FTA-free speech situations' among all speech situations we 
have. If we assume that speech behavior is goal-oriented and targeted 
at someone else who has an independent ego, its proportion seems to be 
very small or even null (also see the next section). 

Focusing on the wakimae aspect of language, researchers of non-
Western languages oppose the view that discernment and politeness are 
mutually distinct sociolinguistic aspects (e.g. Fraser and Nolen, 1987; 
Hwang, 1990); instead, they incorporate deferential aspects of speech as 
a major part of politeness, i.e. "non-FTA" politeness. 

The scope of these researchers may, at least, tell us something 
about the basic problem of defining politeness in a cross-culturally 
valid way. In some cultures, what is emphasized can be the strategic 
aspect of language, in another culture it can be the social standard form 
of language, and in another culture, it can be the principle of 
benevolent modesty from Buddhism (e.g. Kummer, 1992). Across
languages and cultures, we undeniably have different stresses in
wielding our politeness behavior, and in particular how to incorporate
discernment into the theory of politeness remains a critical question.

Similar approaches to the two-dimensional view can be seen among 
Western scholars too. For example, Janney and Arndt (1992) posit two
different types of politeness: "social politeness" and "tact". According to
these researchers, social politeness is "rooted in people's need for
smoothly organized interaction with other members of their group" (p.
22) primarily by following social conventions (e.g. "conversational 
routines", "politeness formulas", "politeness conventions", and 
"formulaic expressions"). This social politeness seems to belong to the 
category of discernment. On the other hand, tact is rooted in people's 
need to maintain face: Janney and Arndt said that "for the preservation 
of face and avoidance of conflict, people need to behave tactfully in an
interpersonally supportive way" (p. 23).6 This tact seems similar to
Brown and Levinson's face-saving strategies or Ide's volitional
politeness. 

POLITENESS VIEW IN THIS RESEARCH 

The argument of Eastern scholars in the previous section leads us 
to the question of whether or not there is a universal theory of 
politeness. If there is, this theory should be capable of satisfying a 
culturally varied concept of politeness across speech communities. 
However, cultural variance may not be as radical as it seems. As
R. Brown (1990), referring to S. Asch (1952), said, "if we looked only at
cultural features, externally viewed, we should see a high degree of
cultural relativism, but if we look at intercultural meanings in terms of P
and D [for example], we see universality or invariance" (p. 31). This 
observation may hold some truth in the deference issue of politeness 
too. Fraser and Nolen (1981) attempted to characterize deference 
through their empirical study of deferential expressions in English as 
well as Spanish. They tried to distinguish deference from politeness 
under the hypothesis that deference is not the same as politeness, but 
that the inappropriate use of deference can result in impolite
behavior. They found that both English and Spanish speakers
consistently agreed on the relative degree of deference associated with
prepared sentences. They found the similarity between English
and Spanish speakers in their understanding of deference in language
to be systematic. Fraser and Nolen suggested that certain semantic
aspects of deferential expressions may function comparably across
languages. Koo (1995) also found that American students expressed
deferential politeness in a similar way to Korean students. Before the
experiment, Koo presumed that only Korean subjects would express
politeness in hypothetical non-FTA speech settings and that American
subjects would only show politeness in situations with FTA. But the 
results did not match his expectation. It seems that deference (or 
discernment) is not limited to Asian languages, but is a widely shared 
social function. Hills et al. (1986) commented that both American 
English and Japanese speakers had both discernmental and volitional 
dimensions of politeness, but for Japanese speakers discernment was
prominent whereas for American speakers volitional politeness was 
prominent. So it seems reasonable to say that the difference among 
cultures is one of emphasis of one function of politeness over another 
and that it is reasonable to incorporate discernment/deference into the 
scope of politeness in cultures that have an emphasis on it. 

However, I do not agree with the view that deferential politeness 
has nothing to do with FTAs, and thus has no relation with Distance and 
Power factors in speech situations. Rather, I believe, all types of 
politeness behavior are possibly related to the speaker's strategic 
motivation to mitigate possible FTAs. Matsumoto (1989) points out that 
saying "it is Sunday today" to someone is not an FTA, yet Japanese subjects 
used polite expressions when higher formality was required; therefore, 
politeness can exist without an FTA (also Koo, 1995). Certainly, the phrase 
"it is Sunday today" is not an FTA sentence since there is no "Rx"; 
however, the interaction itself can conceivably be an FTA for either 
party if there is a great contrast in power such as status difference in a 
given organization, or if the interactants do not know each other at all 
or dislike each other (great "D" and "P"). In short, we can assume that 
every instance of speech behavior has the potential to be an FTA. As 
argued earlier in this chapter, all human interaction may be considered 
goal-oriented and consequently any interpersonal speech behavior 
may be considered strategic in order to serve some purpose. 
A Japanese sociolinguist, Maynard (1989), took the position that 
Japanese linguistic politeness is strategic as a whole. She claimed that 
Japanese speakers use a variety of strategies which may achieve the 
desired goal by "maximizing the effect of personal appeal" and 
"achieving maximum agreeableness to the recipient" (p. 31). Maynard 
called this whole system of the strategies "social packaging", in that 
Japanese speakers should "package" the propositional content of their 
talk appropriately with strategies. Packaging is "a socially motivated 
act to construct the content of the utterance in such a way as to achieve 
maximum agreeableness to the recipient" in keeping "interpersonal 
feelings intact when the semantic content is conveyed to the other 
interactant" (p. 31). Maynard claimed that Japanese speakers do this 
through the frequent use of final particles, fillers to hide the message, 
incompleteness to soften the statement, and delay and avoidance in 
reacting to avoid direct confrontation. Maynard suggested that these 
strategies are positive politeness in emphasizing positive aspects of 
interpersonal relationships although they may fall into the category of 
negative politeness in Brown and Levinson's model. Maynard claimed 
that the background of these speech packaging strategies is Japanese 
society as a homogeneous speech community in which members are 
assumed to hold similar or identical views (Mizutani, 1983, quoted by 
Maynard). This explanation may be true, but at the same time, this kind 
of strategic decoration of speech must be undertaken in any speech 
community for politeness in communication. 

Theoretically, as Maynard suggests, even the use of honorifics 
may be considered to be a "socially-motivated strategy" in the sense that 
its purpose is to demonstrate that the speaker is a competent social 
person. Referring to deference represented by honorifics, Brown and 
Levinson (1987) said that deference is not encoded in language by the 
use of arbitrary forms, but by the use of motivated forms (p. 23). They 
suggest that grammaticalized or conventionalized aspects of honorifics 
are deferential, as opposed to open-ended politeness strategies. They 
state that honorifics can be motivated by various channels: (1) through 
a strategy of giving deference; (2) through a strategy of 
impersonalization; (3) through negative politeness in general for 
higher strata in complex societies; and (3') through positive politeness 
which is internal in lower strata. Brown and Levinson also remarked 
that deference is for "the most part 'frozen' or grammaticalized outputs 
of productive politeness strategies" (p. 23). Therefore, Brown and 
Levinson saw honorifics (i.e., deferential language) as strategic also 
and further claimed that this argument is supported by a variety of 
cross-linguistic research (they quoted Bean, 1978 for Kannada; Hill et 
al., 1978 for modern Nahuatl; Paulston, 1976 for Swedish; McClean, 1973 
for Nepali; Haviland, 1982 for Australian Guugu Yimidhirr; Duranti, 1981 
for Samoan). 

Based on this, Brown and Levinson maintained that "there is not 
a certain quantity of politeness to be conveyed by one channel (the 
grammaticalized honorifics) or another (strategic language 
use)...politeness is usually redundantly expressed in both" (p. 25). In 
their 1987 publication, Brown and Levinson's treatment of honorifics 
was not too different in effect from the two-dimensional politeness 
theories. I believe that Brown and Levinson's implication about 
honorifics is probably correct: deferential politeness like honorifics 
does not differ from intentional politeness in its goal-orientation, but 
deferential politeness can become so conventionalized or structured in 
the system of a language that its users do not need to labor hard to 
produce deference in speech; rather, they just follow the social norm. 
The resulting normative nature of deferential language must be 
stronger in Asian languages with honorifics than in Western languages 
since, as is often pointed out, these Asian languages (e.g. Japanese, 
Korean, and Javanese) have no neutral forms: Speakers have to choose 
informal or formal forms, or if necessary, super honorific forms.7 A 
speaker's intentional politeness strategy and his conformity to 
socially-defined deferential rules (or "frozen strategies") may appear to 
be two different aspects of politeness in reality, but they function 
collectively to produce politeness which is adequate in each speech 
situation from the speaker's perspective. 

The example below is an utterance by a four-year-old Japanese child 
from my data. The child played in our home for eight hours 
and with one exception spoke only in plain forms (i.e., she used only 
informal sentence endings). The only time she used a polite sentence 
was to request an ice-cream cone. She asked three times and used 
exactly the same sentence, which included both a polite form (formal sentence 
ending) and strategic politeness (a question form asking for 
permission). Notice that the sentence-ending form -ka conforms to the 
standard of the model for (D) type (the hearer's territory) propositions 
for formal discourse. 

(6-3) 

sakki no aisu-kuriimu mata mora-tte-mo iidesu-ka? 
a short time ago MODI ice-cream again receive-te PERMISSION(FOR)-Q 

(May I again receive the ice-cream which I had a while ago?) 

According to Brown and Levinson, Mackie (1983) reported that 
negative and positive open-ended politeness strategies are learned by 
Japanese children before they learn deferential honorific politeness. 
In the case of the child that uttered (6-3), even though her social 
experience is still very limited and her knowledge of honorifics is at the 
novice stage, she seems to have felt it necessary to use both channels to 
be very polite. Generally, these two channels are combined in formal 
speech settings in adults' speech too. Brown and Gilman (1989) said 
that a greater number of strategies may be necessary as FTA seriousness 
level increases. This idea was implied by P. Brown and Levinson too by 
their remark that "the more effort a speaker expends in face-saving 
work, the more he will be seen as trying to satisfy the hearer's face 
wants" (1987:143). This point is intuitively appealing. 

Based on these thoughts, my view of politeness in this research is 
summarized as follows: the purpose of linguistic politeness is to save 
each other's face to attain the goal of communication in each speech 
event; therefore, its use is primarily strategic. However, in Japanese 
culture, honorific and formal language use in formal speech situations is 
considered to be an almost mandatory social requirement due to unique 
grammatical restrictions in Japanese (i.e., neutral forms are not 
abundant) as well as to the historically Confucianistic social 
atmosphere. Therefore, there are two major categories of Japanese 
linguistic politeness: (1) honorifics and formal language use, and (2) 
polite linguistic behavior other than (1). Through both channels, 
Japanese speakers realize linguistic politeness. 

Type (1) linguistic politeness, the use of formal forms, is only for 
formal speech situations. However, I believe there is politeness for 
informal situations also, and that type (2) politeness behavior is used in 
both "politeness for formal speech situations" and "politeness for 
informal speech situations"; even in informal speech occasions with 
intimate partners there are some rules of linguistic politeness. The use 
of evidentiality coding belongs to the type (2) "miscellaneous" 
politeness and it is not entirely strategic but partly a set of pragmatic 
rules. 

I will characterize type (2) politeness with the politeness of 
evidentiality in the next section. 

EVIDENTIALITY CODING AND POLITENESS 

Then, where in this system of politeness is evidentiality use 
incorporated? 
Some expressions corresponding to evidentiality markings in 
Japanese are mentioned as "strategic politeness" by Ide (1989), in 
contrast with honorific expressions as "discernment." Table [6-4] is 
quoted from Ide. Ide included expressions that try to "seek agreement", 
"questioning", and "minimize imposition" (for example) within the 
category of strategic "volitional" politeness behavior. These are 
important functions of Japanese sentence-ending evidentials. 

[6-4] Two types of linguistic politeness (Ide, 1989: 232) 

Use (speaker's mode       Language (kinds of linguistic 
of speaking)              device mainly used) 

DISCERNMENT               FORMAL FORMS 
                            honorifics 
                            pronouns 
                            address terms 
                            speech levels 
                            speech formulas, etc. 

VOLITION                  VERBAL STRATEGIES 
                            seek agreement* 
                            joke 
                            question* 
                            be pessimistic 
                            minimize the imposition*, etc. 

(*Underlining is mine.) 

These forms of evidentiality marking (or their equivalents) can 
also be found among positive or negative strategies in Brown and 
Levinson. Charts [6-5] and [6-6] show the possible correlations that I 
observed. Strategies are numbered as originally done by Brown and 
Levinson. 


[6-5] Brown and Levinson's negative politeness strategies (1987) 
and corresponding Japanese sentence final evidentiality markings 

Negative politeness                  Japanese linguistic evidentiality 
strategies (p. 102)                  markings at the sentence end 

1. Be conventionally indirect 
2. Question, hedge                   Groups (4) and (6) evidentials 
                                     from the model 
                                       ka? (question) 
                                       no? (question) 
                                       janai? (negative question) 
                                       daroo? (tag-question), etc. 
3. Be pessimistic 
4. Minimize the imposition           Groups (7), (8), (10), (4), (6), 
                                     and part of (9) evidentials from 
                                     the model 
                                       omou (I think) 
                                       rashii, mitaida, yooda (It seems) 
                                       sooda (I heard-plain) 
                                       daroo (probably) 
                                       kamoshirenai (might be) 
                                       ka? (question) 
                                       no? (question) 
                                       janai? (negative question) 
                                       daroo? (tag-question) 
5. Give deference 
6. Apologize 
7. Impersonalize speaker and hearer 
8. State the FTA as a general rule 
9. Nominalize 
10. Go on record as incurring a debt 
    or as not indebting hearer 

[6-6] Brown and Levinson's positive politeness strategies (1987) 
and corresponding Japanese sentence final evidentiality markings 

Positive politeness                  Japanese linguistic evidentiality 
strategies (p. 102)                  markings at the sentence end 

1. Notice, attend to hearer (his 
   interests, wants, needs, goods) 
2. Exaggerate (interest, approval, 
   sympathy with hearer) 
3. Intensify interest to H 
4. Use in-group identity markers 
5. Seek agreement                    Group (3), (4), and (5) evidentials 
                                       ne? (confirming) 
                                       ne# (sharing) 
                                       janai? (negative question) 
                                       daroo? (tag-question), etc. 
6. Avoid disagreement 
7. Presuppose/raise/assert           Group (3), (4), and (5) evidentials 
   common ground                       ne? (confirming) 
                                       ne# (sharing) 
                                       janai? (negative question) 
                                       daroo? (tag-question), etc. 
8. Joke 
9. Assert or presuppose speaker's 
   knowledge of and concern for 
   hearer's wants 
10. Offer, promise 
11. Be optimistic 
12. Include both speaker and hearer in the activity 
13. Give (or ask for) reasons 
14. Assume or assert reciprocity 
15. Give gifts to hearer (goods, sympathy, understanding, cooperation) 

As shown in [6-5] and [6-6], the Japanese politeness functions of 
sentence-ending forms are mainly equivalent to the following four 
strategies listed by Brown and Levinson: 

(1) "use questions" (negative strategy) 
(2) "minimize the imposition" (negative strategy) 
(3) "seek agreement" (positive strategy) 
(4) "presuppose/raise/assert common ground" (positive strategy) 

I believe that Japanese speakers use most of the strategies listed 
by Brown and Levinson in [6-5] and [6-6]: They pretend to be 
pessimistic, they apologize, impersonalize themselves and the hearers, 
attend to the hearer's interest, exaggerate sympathy, use in-group 
identity markers, try to avoid disagreement, give presents, and so on, to 
be polite. However, in terms of morphological manipulation at the 
sentence ending, which I have been attempting to characterize as 
evidentiality marking, available strategies are limited. It is noteworthy 
that these four strategies are all participant territory related: by 
"asking questions" and "minimizing impositions", a speaker tries to pay 
respect to his hearer's information territory; and by "seeking 
agreement" and "asserting common ground", the speaker tries to extend 
his hearer's information territory to overlap with his own territory. 
These strategies work on the psychological concept of information 
territories shared by interactants and are required in order to be polite. 

The argument so far may suggest that the use of Japanese 
sentence-ending evidentials is a strategic (or volitional) part of 
politeness. However, in actuality, the use of sentence-ending 
evidentials is not entirely "open-ended" or optional. The basic part of 
evidential usage is fairly conventional also in that conformance to the 
"preferred" evidential forms of each level of formality is expected by 
the community, as demonstrated with the proposed model, and 
nonconformance may immediately produce impoliteness even with the 
existence of honorifics (e.g. discourse [5-78], [5-79] in chapter five). 
The appropriate use of evidentials can be deferential in that it involves 
respect of other people's territory and knowledge. Therefore, 
evidentiality coding functions to create both kinds of politeness: 
deferential and strategic. I will examine this point with sentences with 
and without honorific and evidential politeness. 

Ide (1989) also contrasted four types of sentences with possible 
combinations of discernment and volition attempting to distinguish 
these two aspects of politeness in utterances. Sentence types are the 
following four: 

[6-7] (a) -discernment, -volitional 
      (b) +discernment, -volitional 
      (c) -discernment, +volitional 
      (d) +discernment, +volitional 

These combinations also seem to be relevant in the Korean language (e.g. 
Hwang, 1990). In Japanese as well as Korean, obviously the sentence 
type (d) that involves both discernment and volition is the politest, and 
type (a) sentences with neither attribute ought to be least polite. 
Type (b) and (c) sentences can both be polite: type (b) sentences are 
polite in terms of formality and type (c) sentences in terms of speaker's 
intention. Therefore, whether or not type (b) or (c) utterances are 
sufficiently polite needs to be evaluated in each individual speech 
situation. Ide showed examples for (a') to (d') types in Japanese as 
follows in which each corresponds to (a) to (d) in [6-7]. 

[6-8] 
(a') #Kore-o yome. (The # marks a non-polite sentence.) 
     this-ACC read 
     (English equivalent: "Read this.") 
(b') Kore-o o-yomi-nasai mase. 
     read-HON FOR 
     (English equivalent: "Read this.") 
(c') Kore-o yomanai ka? 
     read-NEG Q 
     (English equivalent: "Won't you read this?") 
(d') Kore-o o-yomi-ni-nari mase-n-ka? 
     read-HON FOR. NEG. Q 
     (English equivalent: "Won't you read this?") 

(1989:226) 
Sentence (a') is most casual with plain form ending, yome (Read), and 
(d') is most formal with honorific o-yomi-ni-naru (read) and with 
negative question mase-n-ka (Won't you?). Now, I would like to focus 
on the difference between (b') and (c'). Sentence (b') is formal due to 
its use of the honorific prefix o- with the verb read and the formal imperative 
verb ending, nasai-mase. Ide (1982) said that formal forms "create a 
formal atmosphere where participants are kept away from each other,
avoiding imposition; non-imposition is the essence of polite behavior;
thus, to create a formal atmosphere by the use of formal forms is to be
polite" (p. 382). Certainly, sentence (b') produces a formal atmosphere,
but on the other hand, since it is an imperative without room for
negotiation by the addressee, I believe it can be adequate only when
the speaker is authoritative and has every right to give commands to
the addressee. If this kind of sentence is used for other speaker-hearer
relationships, the sentence may produce an effect of "ingin-burei
(慇懃無礼)", meaning insolent politeness, or haughtiness under a cloak of
apparent politeness, which is often found with statements about the
hearer's territory information, as in the (6-9) sentences:
(6-9)

(a) o-uchi ga taishite hiroku-nai-n-desu kara 
    HON-house NOM not much spacious-NEG-n-COP(FOR) because 
    o-futari tomo konya wa oteyawarakani 
    HON two people both tonight CONT easy 
    onegaishimasu ne.. 
    I beg you (FOR) RAPP 

    [Your] house is not very big so I beg you two to be 
    quiet tonight. 

(b) anata ga nasatte-iru koto wa zenzen buzinesu 
    you NOM doing-GER(HON) COMP CONT at all business 
    nanka jaarimasen.
    something like NEG(FOR)

    What you are doing is not anything like business. 

Both utterances in (6-9) are polite in terms of forms involving 
honorifics but impolite in terms of information territory. In (a), the 
speaker used direct evidentials in formal forms to state that the 
hearer's house is not big enough (E type proposition) to ensure her 
privacy during her overnight stay. In (b), the direct plain evidential 
(jaarimasen), which was used to criticize the hearer's behavior (i.e., 
D type proposition), together with the employment of some lexical items 
(e.g. zenzen, nanka), makes the whole utterance impolite even with the 
existence of honorifics. As the examples suggest, if we define politeness 
as behavioral strategies to make possible communication between 
"potentially aggressive partners", formal sentences using honorifics 
may sound impolite if they are not accompanied by the speaker's strategic 
manipulation at the sentence ending. 

The type (c') sentence in [6-8], on the other hand, is without 
either honorifics or formal ending so it is not appropriate for formal 
settings; however, the sentence can be polite enough with the 
negative-question ending (yomanai-ka, "Won't you try reading?") when uttered in 
informal speech situations. So this is a case of "politeness for informal 
occasions". In engaging in an informal conversation, the participants 
generally preferred plain form speech because it emphasizes the close 
(in-group) relationship among the participants (cf. emphasis on "in-group 
identity" is one of Brown and Levinson's positive politeness 
strategies); however, the informants generally kept using so-called 
strategic politeness in such an informal speech setting. Strategies are, 
as earlier explained, "ask questions", "mitigate imposition", "seek 
agreement", and "assert common ground" in Brown and Levinson's 
terms. An old (but still popular) Japanese saying says "shitashiki naka 
ni mo reigi ari" (There ought to be politeness among intimates). In 
order to accomplish this "intimate politeness", these strategies are 
realized by the choice of evidentiality markers at the sentence ending, 
not formal forms. 

In chapter five, it was demonstrated that speakers show 
willingness to express respect towards the hearers' knowledge and 
information territory through sentence-ending evidential forms. In 
talking about (B) type propositions (the speaker's territory 
information) in formal situations, speakers used Group (4) tag-questions 
and Group (6) question forms most often, after "rapport -ne" forms, 
which were most popular. Even in informal settings, "confirmation -ne" 
was most popular. For (C) type propositions (i.e., shared information) 
in both formal and informal situations Group (4) type tag-question 
endings as well as Group (5) "sharing -ne#" endings were most 
frequently used. Although both (B) and (C) type propositions are in the 
speaker's information territory, speakers emphasized "common 
grounds". From the viewpoint of Brown and Levinson's theory, this use 
of evidentials is a form of positive politeness. Even with (D) type 
information (i.e., the hearer's territory information of which the 
speaker has no knowledge), when the topic is something favorable for 
the hearer, the speaker sometimes used rather direct evidentials, which 
is theoretically an imposition on the hearer's territory. But it implies 
the speaker's understanding that good news about the hearer is always 
true (cf. chapter five). This is a case of evidentiality implicature of 
positive politeness. In expressing (E) type propositions (i.e., the 
hearer's territory information), not only formal but also informal 
utterances used questioning forms from Group (4) and (6). In formal 
situations, Group (8) hearsay expressions were also preferred for (E) 
propositions even though speakers had some knowledge about the topic, 
suggesting that, in speaking about (E) type propositions, speakers 
attempted to mitigate imposition into the hearer's information territory 
(i.e., negative politeness). In particular, the use of pure question forms 
with -ka in this proposition type shows the speaker's intention to 
distance himself from the hearer's territory.

 I have claimed that this research identified and described 
systematic use of evidentiality coding in both formal and informal 
speech settings based on informant behavior. I believe that the pattern 
of commonly-preferred use of evidentiality markings in reference to 
information territory as proposed by the model may be regarded as an 
adequate and almost pragmatically required politeness rule in the same 
way that appropriate honorific use is pragmatically obligatory in 
formal settings. In addition, the use of evidentiality implicature for the 
purpose of less assertive utterances produces higher politeness. This is 
a strategic use of evidentiality coding. 

An additional function of evidentiality use in the Japanese 
politeness paradigm is that evidentiality usage contributes to extending 
the domain of the speaker's volitional expression of politeness which is 
fairly restricted by socially required honorifics and related formal 
language usage. As discussed earlier, honorific usage is not 
automatically and perfectly molded by participants' social status: There 
is some room for flexibility depending on conversationalists' emotional 
relationships, and so forth. However, rational Japanese speakers still 
seem to observe the socially established honorific framework with the 
concept of wakimae. In this restricted environment, some speakers, 
following the framework of honorifics on one hand, use evidentiality 
coding to express the low degree of politeness which they judged to be 
appropriate towards a particular hearer. An example of this is the 
above-mentioned "ingin-burei" (insolent politeness) case. A hearer's 
reaction to ingin-burei utterances is often such as "the speaker is 
conventionally polite but too direct." The reaction is negative toward 
the speaker but the speaker's conformance to the social rules of 
honorific use at the surface level should be acknowledged. In this 
sense, evidentials, in relation with honorifics, may give room for 
assertion by an outspoken speaker. This is another case of strategic use 
of evidentiality, i.e., evidentiality implicature. However, again, it should 
be noted that the violation of the evidentiality rules for the purpose of 
being assertive will bring forth a negative image of the speaker. 

Therefore, evidentiality use is, from the viewpoint of linguistic 
politeness, both deferential and strategic. I summarize the entire 
picture of Japanese politeness in relation with evidentiality as follows: 

[6-10] Types of Japanese linguistic politeness

Forms of language             Types of politeness    Speech situation 

Formal forms                  Deference              Formal 
(including honorifics, 
pronouns, address 
terms, etc.) 

Evidentials                   Deference              Formal, informal 
                              Strategic              Formal, informal 

Others                        Strategic              Formal, informal 
(including use of 
ellipsis of case-marking 
particles, back channelling, 
hedges) 

If we view the place of evidentiality coding through two sets of 
opposing factors, formal vs. informal and deferential vs. strategic, the 
whole picture would be as follows in [6-11]: 


[6-11] Types of Japanese linguistic politeness behavior in formal and 
informal speech situations 

                      SOCIALLY-REQUIRED          OPEN-ENDED 
                      RULES                      STRATEGIES 

FORMAL SITUATION      -honorifics                -hyper-honorifics 
                      -formal forms              -evidentiality 
                       (including address         implicature 
                       terms, pronouns etc)      -back channelling, 
                      -evidentiality              hedges, observation of 
                       coding for formal          turn-takings, etc. 
                       situations 

INFORMAL SITUATION    -evidentiality             -evidentiality 
                       coding for informal        implicature 
                       situations                -back channelling, 
                                                  hedges, observation of 
                                                  turn-takings, etc. 

We may possibly assume that the use of evidentiality expressions 
is "frozen" into the role of conventional politeness just as Brown and 
Levinson argued that honorific usage is "frozen or grammaticalized" as 
deferential politeness (1987: 23). If honorific use is a speaker's 
"automatic" response to a formal atmosphere (e.g. Hills et al., 1986), to 
some extent, so is evidentiality marking. This view is supported by the 
informants' comments on their own speech behavior. When applicable, 
I asked informants whose selection of evidentiality markings 
conformed to my model of evidentiality (i.e., was pragmatically correct) to 
tell me their reasons for choosing a certain evidentiality marking over 
another. Generally, they answered that they did not know: Somehow 
they felt that way. Some gave slightly more specific observations. The 
following is a subset of their answers:8 

(In talking to somebody superior)

"We should speak about the person's personal matters indirectly.
desho and daroo are appropriate."
"When we talk with a superior person, we should talk indirectly."
"Direct forms are too decisive."
"Direct forms are too clear-cut."



Speakers did not use the word polite in their comments, and 
obviously the speakers themselves were not aware that the sentence 
final forms function to produce different levels of politeness. I would 
like to claim that speakers' general low awareness of the function of 
sentence ending forms indicates how deeply the concept of evidentiality 
marking is rooted in the Japanese psychology of inter-personal 
communication. Furthermore, it is important to note that the 
appropriate use of evidentiality marking is indispensable to make all 
kinds of formal and informal speech interaction polite while honorifics 
create only formal politeness which can be "haughty politeness" in 
some cases. Data analysis shows that the use of both formal forms (i.e., 
addressee honorifics) and evidential politeness is normative in more 
formal situations (except for a few cases with only formal forms, i.e., only 
deferential politeness). On the other hand, in informal speech settings, 
the use of formal form sentence-endings and hyper-polite honorific 
language decreased drastically to almost none; however, the informants 
continued to use hearer-sensitive and territory-sensitive evidentials 
even in the most casual family discourse. 

From this observation, I speculate that the system of evidentiality 
markings in Japanese interacts with Japanese politeness in a very 
fundamental way: the speaker's awareness of interactants' invisible 
territory of information may be a very basic psycho-linguistic 
foundation of Japanese speakers. This view is supported by 
developmental evidence from children's speech in the following 
section. 

Socially, the concept of territory is obviously power-related. The 
use of evidentiality in Japanese women's speech provides supportive 
evidence on this point. 

DEVELOPMENTAL EVIDENCE: CHILD'S EVIDENTIALITY MARKINGS 

The argument that there are two different kinds of politeness 
behavior (i.e., normative and instrumental, or deferential and strategic) 
is supported by developmental evidence from child language 
development research. An overview of the literature shows us that 
children seem to learn two types of politeness behavior independently 
(e.g. Ervin-Tripp, Guo, and Lampert, 1990; Blum-Kulka, 1990; Snow et al., 
1990). The literature generally suggests that children acquire strategic 
variation of language use from their natural environment; on the other 
hand, formal deferential politeness is taught explicitly (e.g. Brown and 
Levinson, 1987). Snow et al. (1990) observed that parents generally 
address children's positive and negative face when making requests. 
Researchers found ample use of both positive and negative strategies in 
parent-child interaction although children rarely received explicit 
instruction on how to be polite. Snow et al. concluded that children at 
younger ages are already aware of three critical factors of politeness: 
distance, power, and degree of imposition. Ervin-Tripp et al. (1990) 
observed that children become increasingly polite between the ages of 
two and five: At this stage, they identify "on-record" politeness as 
appropriate to control a certain addressee, and also use the forms of 
"formulaic social indices". By five, they can differentiate to whom they 
should be polite, and have learned how to use politeness as a persuasive 
tactic. Ervin-Tripp et al. claim that at the age of five, children conform 
to Brown and Levinson's model regarding the relative relationship 
between the degree of imposition and the social tactics used to maintain 
good human relationship, suggesting that children are capable of using 
strategic politeness at an early stage of life. In his research with 
four- and five-year-old children, James (1978) found that those children 
adjusted the politeness level on the basis of their listener's age status 
and the nature of the situation (e.g. command, request); and that 
situational factors take precedence over status difference. Axia and 
Baroni (1985) reported that at an early stage of life (five to seven years 
old) children showed their ability to react to the predicted cost of their 
request according to the social situation, but their ability to be 
appropriately polite with different status addressees did not seem to 

386 


develop before the age of nine. Bates (1976) found that, from her 
research on Italian preschool children, the first politeness strategies 
that children learn are those for minimizing the offense of a request; 
children were creative in "softening" their requests. These studies 
commonly suggest that at an early stage of their life, children are aware of
the need to be polite when the situation requires, thus they learn 
strategic politeness first. 

MacKie's (1983) study (referred to by Brown and Levinson, 1987) was
conducted with Japanese children. In Japanese, as is generally understood,
learning a fully elaborated system of honorifics requires lengthy 
exposure to formal social environments; therefore, children may take a 
long time to become competent in honorific use. Although it is difficult 
to say exactly how long due to differences in individuals' environmental 
factors, it is said to take twenty years or even a lifetime. MacKie claimed 
that Japanese second-grade children were at least using formal 
sentential-ending forms (i.e., -desu, -masu, etc.), and that they also 
presented an early stage of strategic politeness: tone of voice,
sentence-final particles, and asking for agreement with questions or
tag-questions, which in some cases are evidentiality codings.

The data for this research also support MacKie's view. As for
the second-graders, I observed that, in classroom situations, they
generally did not use formal sentences to their female teachers. This is
understandable because it was obvious that the children considered the 
teachers to be their "friends". Teachers did not particularly intend to
create a formal atmosphere; rather intimacy between the teachers and 
the pupils was emphasized. Edwards et al. (1978) quoted Flanders (1967) 
who called "classrooms [in English-speaking countries] as 'affectional 
desert' because almost all the talk there is devoted to official business, 
and even teaching which is cognitively stimulating has been described 
as leaving no room for passion and emotion" and said formality is 
difficult to escape in classroom interaction (p. 24). Although this
must be true to a certain degree, in the Japanese classrooms I observed, 
teachers called their pupils by their first name using terms of 
endearment, -chan for girls, and -kun for boys (for example, a girl 
named Nana Suzuki is addressed as Nana-chan just like in intimate 
family situations). This phenomenon was quite alien to me since in my 
own experience in the Japanese education system, teachers used pupils' 
family name with -san (equal to Mr., Mrs., and Miss in English) at least 
for girls. Obviously, a family-like atmosphere has been introduced in 
the Japanese classroom as an official public educational policy at some 
time in the past twenty years. I observed that teachers often
talked to pupils as they would talk to their own children or young
friends, but sometimes used formal sentences to "straighten up" the
classroom atmosphere. However, interestingly, there were occasions 
when second-graders consistently used formal sentences. As we noted 
earlier, Japanese speech is in either formal form or informal form so 
that switching from one to the other is usually performed intentionally 
for some reason. One of the formal occasions for the pupils was
"Thank-You-Friend" time, in which children volunteer to express their 
thanks to one or two particular friends for whatever nice things the 
friends had done to or for them on that particular day. A possible 
explanation for the children's voluntary use of formal forms for the 
event is that children considered this speech setting to be formal since 
it is a time to be thankful to somebody. Another occasion on which formal
sentences were used by these young children was when commenting on 
other students' compositions at composition time. They were 
encouraged to praise good aspects of each other's composition without 
being critical. This indicates that at the age of seven, Japanese children 
are able to, or beginning to, understand the difference between formal 
and informal speech settings. It seems that the foundation of formal 
deferential politeness is acquired in early elementary school years. In 
the next discourse by second-graders, speaker S1 asked for his 
classmates' opinion of his composition, and they unanimously used 
formal sentence-endings to praise S1's presentation. Their level of 
evidentiality is indirect as it uses omou (I think). This sentence-ending 
form is very appropriate to give opinions on the hearer's information 
(i.e., D and E type propositions) in a rather one-way communication 
from the speakers to S1. 

[6-11]

S1: (finishing the reading)
    owari desu.    ii   tokoro arimasu    ka?
    end   COP(FOR) good point  exist(FOR) Q

S2: yoku   kaketete         ii   to   omoimasu.
    nicely write(POT)(STAT) good COMP think(FOR)

S3: koe   no   ookisa ga  hakkiri shitette ii   to   omoimasu.
    voice MODI volume NOM clear(STAT)      good QUOT think(FOR)

S4: kaiwa        no   bubun o   iretete       ii   to   omoimasu.
    conversation MODI part  OBJ include(STAT) good QUOT think(FOR)

S1: This is the end [of my composition]. Is there anything good
    with this?
S2: I think it is well-written and good.
S3: I think the volume of voice was clear (=big) and good.
S4: I think it is good because conversation parts were included.

The teacher also spoke in polite forms when conducting this 
session. Instead of asking direct questions to verify comprehension 
such as Then what did S1 do?, she asked indirect questions such as Then 
what did he say he did...?, What do you think is important there?, and 
Does it seem such and such? thus distancing S1's information territory 
from herself as well as from other pupils. I strongly felt that the 
teacher's language behavior, as a part of the learning environment, 
certainly drew pupils' attention to the given social context in which 
politeness is preferred. Vygotsky (1978), Bakhtin (1981) and other 
researchers demonstrated the importance of dialogue between a child 
and an adult in assisting children to learn critical literacy (i.e., "social 
constructivist" view). Using adults as mediators, children reorganize 
and reconstruct their social experience and internalize it as their own 
individual experience. It was suggested that children follow a certain
process before internalizing their social cognition through building an
"intersubjectivity" which is shared with their teacher in classroom
discourse. For example, children's writing often shows that they are
responsive to the social and cognitive norms of the discourse
community (e.g. McCarthey, 1994). Not only literacy but also cultural
concepts such as politeness with regard to participants' information
territory in language use must be learned by children in this
interactional process with adults including teachers.

Whereas I did not have an opportunity to observe older students'
classes in the elementary school, I did have a chance to attend a
whole-school gathering. At the end of the observation day, there was a
whole-school meeting where all the students and teachers got together
(it was a small school) and discussed how to prepare for the coming
summer break. As a convention, the gathering was conducted by the
student body that is organized by a few student representatives (the
oldest, sixth-graders) with the help of a teacher in charge of the
student organization.
The male teacher in charge, who sounded both nice and authoritative, 
used both informal and formal sentence-ending forms. He used formal 
sentences to call the whole gathering to order, and informal sentences 
to speak to a particular student. The teacher's strategy worked well in 
that his formal sentences created the formal atmosphere of the
gathering while the informal speech to students on call enhanced his 
"authority" over the students, i.e., the male teacher used informal forms 
to indicate the power difference between himself and the students. The
female teacher of the second-graders also used both informal and formal
speech, as noted earlier; however, in contrast to the male teacher,
her informal speech created an "intimate" atmosphere with the pupils.9
It was performed through the use of hearer-sensitive evidential forms
such as confirming -(yo)ne (am I right?), and -deshoo (tag-question).
This case corresponds to the female speaker's strategy of using
evidentials to "involve" the hearers' knowledge and attention in the
speaker's discourse.

In the whole-school gathering, all students on call used formal 
sentences regardless of the forms that the teachers used. Although the 
speakers were from the older grades (grades four to six), this suggests that
they become aware of formal/informal speech settings by these ages. 

Then, what about the development of politeness of informal 
forms? As discussed in chapter five, the data showed that elementary 
school second-graders presented some initial development of the use of 
hearer-sensitive evidentials even though they rarely used formal 
sentences as shown in the following table: 

[6-12] Occurrence of formal and informal sentence-ending forms

            ADULT         CHILD (7, 8, and 10s)
FORMAL      2940 (45%)     56 ( 9%)
INFORMAL    3515 (54%)    513 (90%)



The primary reason for the dominance of informal sentences in 
children's discourse at school is, as noted, that the situations were 
considered to be informal by the young speakers. In addition, the 
speakers were not yet used to formal utterances due to insufficient 
experience with social interaction. But still the use of evidentiality 
coding which conforms to the rules of the speaker's and the
hearer's information territory could be seen in the children's
utterances. Of course, it was not yet fully developed, but was already
noticeable (cf. chapter five). Therefore, the emergence of an
honorific-related concept of formality and a territory-related concept 
of evidentiality may develop together starting from the early 
elementary years. 

WOMEN'S LANGUAGE AND EVIDENTIALITY MARKINGS 

There have been claims that women's linguistic behavior is both
less assertive and more polite than that of men (e.g. Lakoff, 1975; P. Brown,
1980). Although this is not so in some cultures (cf. Keenan about
Madagascar, 1974a, 1974b), Japanese women, in general, have been said
to speak more politely than Japanese men do (e.g. Ide et al., 1986, 1991;
Wetzel, 1988). It is not the intent of this section to consider why
different linguistic variations, which create difference in politeness
level in particular, are found between men and women, but I will
briefly touch upon some popular arguments among scholars regarding
the "explanations" for the existing differences between male and female 
speech norms. Some sociolinguistic studies on sex difference observe 
that women are more sensitive than men to forms of socially prestigious 
language or standard forms, formality levels of speech settings, and 
other sociological significance of linguistic variables (e.g. Labov, 1966a, 
1966b; Trudgill, 1972, 1983a, 1983b; Gal, 1978). The general findings 
from those studies demonstrate rather consistent observations with 
respect to gender-difference in linguistic variables, at least in 
urbanized societies (Trudgill, 1983a). A variety of explanations on how
the pattern of sex-differentiated variations in language occurs have
been proposed, but none of them seems to be satisfying (e.g. Walters,
1989: pp. 111-113, introducing Trudgill's examination, 1983a).10 In
addition, there is a question of the social "norm" of linguistic variables,
i.e., male speech has been considered to be the "norm" and female
speech had to be explained with reasons of deviation from the "norm"
(e.g. Coates, 1986). For example, Robin Lakoff, in her series of seminal
writings, argued that women are taught and destined to speak in 
women's style (e.g. 1975, 1977). Lakoff described women's language and 
speech style as characteristically less assertive than men's: Female 
speech is hesitant, tentative, agreeing, trivializing, asking, and 
indirect. Lakoff attributes the "insecurity" that characterizes women's
language to women's secondary social status. The basic premise of
Lakoff's argument was that men's language is the social norm from
which women deviate and that the male norm is superior to the female 
deviation. Her work was criticized for this primary standpoint and also
for the anecdotal data on which the study was mainly based. P. Brown (1980)
suggested that Lakoff's concept of "women's language" needs to be 
modified in that "some or all these features (of women's language) 
appear to be more closely related to social position in the larger society 
and/or the specific context" (p. 109), not primarily related to gender. In 
Brown's research on courtroom discourse, some men spoke with the 
features of women's language described by Lakoff and some women did 
not. In short, P. Brown meant that powerless people in a given speech 
setting (including the society itself) speak a "powerless language". She 
agreed with Lakoff in that powerless language may be a reflection of a 
powerless social situation, but it also would seem to reinforce their 
"inferior status". 

Originally, women's sensitivity to the language they use must be
related to the historical power imbalance in society, or possibly the
traditionally acknowledged role differences between men and women.
Trudgill (1983a, b) speculated that women tend to gain their status
through how they appear so that they tend to secure their status by
showing it clearly through their language of sophistication. In modern
society, there is not an explicit demand for women to speak more
politely than men. If average female speakers really speak more politely
than average male speakers in the same circumstances, perhaps female
speakers choose to do so strategically for their own good, being aware
of some social expectation based on sexist social tradition.

Considering the speaker's sensitivity to social power among 
interlocutors in using language, I feel that P. Brown may be correct in 
arguing that what matters is the speaker's level of power over other 
interlocutors in a given circumstance. Then, we inevitably come back to
the same point: women's general status in society. 

Japanese society has historically been a men's society where 
women's subordinate status is literally "visible". Even today, Western-
style feminism has not yet fully become an influential doctrine in 
Japanese society. McGloin et al. (1991) commented on the low 
achievement of feminism in Japan by stating that "Japanese women 
prefer a complementary vision of status and role differences, giving 
them equal dignity, despite differences in form" (p.2 of the 
introduction). Wetzel (1988) proposed that "power" as a sociolinguistic 
variable to control female or male speech may need to be redefined in 
societies such as Japanese, suggesting that talking in feminine ways is 
not always powerless in some cultures. Contrary to Wetzel, Smith-
Shibamoto claimed that Japanese women's traditional status is secondary 
to men's and, accordingly, the norm of women's language is practically
powerless (1987, 1992). Shibamoto (1992) argued, in her study of 
Japanese women's "directives", that Japanese women who are in 
positions of authority (which is not traditional in the society) appeared 
to experience linguistic conflict. Then, how do they solve this conflict? 
They may minimize their feminine speech (Reynolds, 1990), or resort to
some female "strategies". Sunaoshi (1995) found that some Japanese 
women in a managing or supervising position did not talk like male 
managers, but talked like a mother or sister in a family, which was
effective in managing their subordinates without conflict (the "Passive 
Power Strategy" of Reynolds, 1990). Along with Smith-Shibamoto (cited 
below), I feel that linguistic changes are expected regarding Japanese 
women's language as their status in the modern society gradually 
changes, although significant language change will likely take a long 
time. 

The relative stability of the gendered cultural norms of
appropriate linguistic style that constrain women to using
nonassertive, "polite," and in certain contexts less effective forms
of speech should not blind us to the various creative solutions to
the problem of incongruity between these norms and actual
social status being found by today's Japanese women.
(Smith-Shibamoto, 1992:79)

Returning to the issues of this dissertation, regardless of whether
Japanese women are powerless or powerful, the reality is that they are
said to speak more politely than men do. Ide et al. (1986) and Ide (1991)
claimed that, in expressing discernment (wakimae) politeness (i.e.,
honorifics and formal forms), gender differences affected the choice of
language forms. Based on their survey studies, researchers have
advocated three major factors in the women's politer speech: (1) women's
lower assessment of the politeness level of linguistic forms, (2) women's
higher assessment of appropriate politeness level that should be used to
different types of addressees, and (3) the higher frequency with which
women engage in interaction patterns which require higher linguistic
forms (1991: 65-66). For factor (1), subjects scored the politeness level 
for eighteen different Japanese forms that meant when do you go? 
Female subjects scored most of the forms lower than the male subjects 
did. For factor (2), subjects scored the politeness level that they thought 
to be appropriate to twelve addressees: the addressees are (supposedly in 
order from "low" to "high") child, spouse, delivery person, friends, 
workplace inferior, same-status colleague, neighbor, spouse's friends, 
parent at P.T.A. meeting, instructor of hobby group, their children's 
professor, and workplace superior. Female subjects scored the 
politeness level that these hypothetical addressees deserve higher than 
the male informants did (except for "child", "neighbor", "PTA 
meeting").11 Factor (3) relates to the informants' interactional
patterns: Women reported they have more frequent interaction than
men do with the kinds of addressees who were associated with a higher
politeness level than their statuses actually were scored with.12 In
short, the studies suggested that Japanese women feel some forms are
less polite than men feel; they feel that addressee's status is higher than
men feel; and they socialize more with people with whom they feel a
need to be especially polite. Although these studies are based on
surveys which use self-reported behavior, these data suggest that 
Japanese women's deferential politeness with honorifics is higher than 
men's. Then, what are the actual forms of Japanese women's politer 
language? How are they related to the Japanese evidentiality system? 

Numerous forms that reflect Japanese women's higher politeness 
have been reported: use of personal pronouns (e.g. Kanamaru, 1993), 
hyper-correct honorifics, feminine sentence-final-particles such as 
wa, no, kashira (e.g. McGloin, 1986 cited by Ide 1991), ellipsis of topic
marker wa and the subject marker ga (Smith-Shibamoto, 1992), and if
we include broadly pragmatic feminine behavior such as "use of
silence", "frequent hedges", "frequent back-channelling", "avoidance
of vulgar expressions", "observation of turn-taking", there must be
even more aspects of feminine politeness (e.g. Shigemitsu, 1993, Suzuki,
1993). These studies point to Japanese women's strategic politeness.
Therefore, it seems that Japanese women are more polite than Japanese
men both deferentially and strategically.

In this study, it has been suggested that women's evidentiality 
markings, in contrast with men's, indicate women's politer linguistic 
behavior through evidentiality coding. Some evidentiality forms such 
as "questioning" (e.g. Shigemitsu, 1993) have been pointed out as 
common behavior among female speakers. The data analysis of chapter 
five also showed that female speakers actually used more question 
sentences (29% of all utterances) than male speakers (13% of all 
utterances). Also female speakers' proportion of direct forms was 
smaller than male speakers' (52% vs. 66%). The data also showed that female
speakers were more evidentially sensitive to the hearer's knowledge 
and territory and also to difference in formality level than men were. 
Coates (1986) introduced a similar observation of Jones (1980): 

Her [Jones'] most significant observation is that, where men 
disagree with or ignore each others' utterances, women tend to 
acknowledge and build on them. In other words, it seems that 
men pursue a style of interaction based on power, while women 
pursue a style based on solidarity and support. (Coates, 1986: 
115) 

Yet the hypothesis that Japanese women speak less directly than men 
about their own matters (i.e. speaker's territory information) was not 
supported in this research: Both men and women equally spoke with 
more direct language than expected about their own information as well 
as third person's information. But, in expressing shared-information 
with the hearer, female speakers positively asserted "common ground" 
with the hearer through evidentials in both formal and friend 
discourses. 

In conclusion, there may be a relationship between the
effect of sex-difference on politeness in the Japanese language and
evidentiality markings.

THEORETICAL RELATIONSHIP BETWEEN POLITENESS FACTORS AND 
EVIDENTIALITY MARKING 

In previous chapters, I presented two factors that affect the
choice of evidentials: proposition type (i.e., topic) and discourse type
(i.e., degree of formality in a given speech situation), both of which
are related to "distance". The proposition type is related to the
"distance" between the topic and the speakers' territory and knowledge.
Discourse type is related to "distance" among speakers.

Brown and Levinson's original formula, W_x = D(S,H) + P(H,S) + R_x,
for positive and negative face strategies has been accepted in many
politeness studies. Although cross-culturally, people might perceive 
the same social situations--as well as the relative importance of each 
social parameter--in different ways (cf. Blum-Kulka and House, 1989), 
the three factors, "D", "P", and "R" have been acknowledged to be useful 
in various languages (see Brown and Levinson, 1987:24). However, 
there have been a variety of suggestions regarding the "D" factor as 
discussed earlier in this chapter (e.g. Brown and Gilman, 1989). Each 
culture may possibly show unique incalculable factors that influence 
the level of politeness. Minami (1974), for example, explained that 
Japanese honorific choice is usually based on the following factors: 

[6-13] 

(a) PARTICIPANTS' RELATIONSHIP 
-gender, social status, age, in/out group membership, 
-historical relationship between the participants (e.g. One of 
them is in debt to the other.), 
-temporary social relationship between the two (e.g. at store, 
hospital, street) 

(b) TOPIC (referent)
-the owner of the topic: formal topic, general topic, or personal
topic of either of the participants

(c) SOCIAL CIRCUMSTANCES OF INTERACTION
-formal/casual environment,
-group/one-to-one/one-to-many communication,
-ways of communication (e.g. letter, telephone talk, telephone
message)

(d) INTRA-DISCOURSE FACTORS
-position of utterances within the discourse
(e.g. beginning/core/ending)

It seems that (a) "participants' relationship" in [6-13] above is
measured by "D" and "P", and (b) the "topic" factor is measured by "D" 
and "R". The major part of (c) and (d) seems to be both "D" and "P" as 
well as social conventions, but there seem to be more factors to be 
considered in order to "compute" the appropriate honorific level in 
Japanese. Ide (1989) suggested that besides social variables, there are 
psychological variables such as affinity and affect, which influence a 
speaker's choice of politeness in general, and which certainly affect 
honorific choices as well. Ide's proposal is in line with R. Brown and
Gilman (1990), Field (1991), Koo (1995), and others who proposed a
modification of Brown and Levinson's formula by suggesting
"affect" or "familiarity" factors.13 Further, Hill et al. (1986) proposed to
combine all the relational and situational factors by introducing the
concept of "PD" ("perceived distance"). They defined "PD" as "the 
distance perceived by a speaker to exist between the self and a 
particular addressee in a particular situation and operating in a shared 
sociolinguistic milieu" (p. 351) and explained that "PD" also comprises
the additional factor of "degree of imposition" ("DI") of behavior, thus 
"PD" is the sum of the factors of addressee status and situational "DI". 
This theory proposed the following formula: 

[6-14] 
PLx = PDx 

This simple formula suggests that the concept of politeness is too 
abstract to be pinned down by a few relational and situational factors. 
The strength of the "PD" concept is that it is based on the distance
which the speaker "perceives" between himself and his hearer in each speech
situation in a given culture. Perceived distance is a result of the 
interactional effect of real social distance in relation with "rank", 
"class", "group", and "psychological distance" such as "liking" and 
"familiarity". Some studies (e.g. Brown and Gilman, 1989, Wolfson, 1988) 
found different effects of these D factors, horizontal distance in 
particular, without being consistent with the Brown and Levinson 
framework which stipulates that further distance generally produces 
higher politeness. Hill et al.'s formula, PLx = PDx, accommodates all 
possible D factor-related aspects. This concept may seem to be indefinite
but I believe it is acceptable, since we produce polite behavior possibly
from two or more different channels, and emphasis on each channel
may be different in each speech situation in a given culture. Also the
number of influential situational/relational factors as well as emphasis
on each differs from one culture to another; each culture may require a
complex formulation of influential factors on politeness to
systematically predict the politeness behavior of the members of that
culture. For these reasons, I believe that Hill et al.'s formula is
sufficiently
abstract to be generalized in universal terms. Shibatani (1990) also
emphasized that the basic concept of honorifics is "distance" and that 
formality factors contribute to produce distance between people: 

The honorific system appears to be ultimately explainable in 
terms of the notion of (psychological) distance. Honorifics 
(inclusive of the polite forms hereafter) are used in reference to 
someone who is psychologically distant. The formality factors 
that tend to trigger honorifics contribute to creating a sense of 
distance between people. The use of honorifics toward someone 
unfamiliar, regardless of the addressee's social standing, and the 
non-use of honorifics toward someone familiar, even if the 
addressee's social standing is higher, are both controlled by the 
factor of psychological distance. 

(1990: 379) 

Shibatani, in agreement with Fillmore (1975), also commented 
that honorifics universally "can be considered as deictic expressions by 
virtue of their role of anchoring the referent and speech-act 
participants in particular social locations, i.e., status" (p. 378). He 
remarked that this deictic function of honorific forms in relation with 
the social status of the referent is presumably universal. 

Although I have attempted to demonstrate that the concept of 
perceived distance seems to be useful to see "overall" politeness across 
cultures, some modification may be necessary for Japanese from the 
perspective of evidentiality marking. Through the previous chapters, I 
have demonstrated that politeness level should be determined not only
by the association of the "distance perceived by a speaker to exist
between the self and a particular addressee" and the "degree of
imposition of the behavior", but also by a "perceived distance between the
participants and the referent". A speaker also perceives a distance
between self and the referent, and a distance between his addressee and 
the referent. Therefore in deciding the overall politeness level a 
speaker ought to perceive three kinds of distance: 

[6-15]

(1) distance between himself and the addressee
(2) distance between the referent and himself
(3) distance between the referent and the addressee

Distance (1) is a social difference in such respects as (a) hierarchical
organizational rank, (b) economic status, (c) gender, (d) age, (e) race,
(f) familiarity, (g) liking, and (h) interpersonal history between the
two participants. This kind of distance also includes some situational
factors. For example, the act of answering an intimate friend's question 
as a presenter of a paper at an academic conference requires a higher
level of politeness than one's usual conversation with the same
addressee. Also, we might use higher politeness in writing than when
talking to the same person face-to-face. In this way, the situational
setting may affect this type of distance.

Distances (2) and (3) are important from the perspective of
evidentiality marking in Japanese. This kind of distance should be 
correctly perceived by the speaker in order to conform to the rules of 
information territory; the speaker must consider how much the hearer
knows about the referent. A speaker makes "inferences" based on
perceived distances (2) and (3) and thus chooses appropriate evidential
forms.
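
The inference step described above can be illustrated with a small sketch. It is only a hypothetical rendering of [6-15]: the numeric 0.0-1.0 scale, the threshold values, the category labels, and the names PerceivedDistances, choose_evidential, and choose_register are all assumptions introduced for illustration, not part of the model or of the data.

```python
from dataclasses import dataclass


@dataclass
class PerceivedDistances:
    """The three distances of [6-15] on a hypothetical 0.0-1.0 scale."""
    speaker_addressee: float   # distance (1)
    referent_speaker: float    # distance (2)
    referent_addressee: float  # distance (3)


def choose_evidential(d: PerceivedDistances, close: float = 0.5) -> str:
    """Pick a coarse evidential category from distances (2) and (3).

    The threshold `close` and the three labels are illustrative
    assumptions, not values derived from the dissertation's data.
    """
    speaker_near = d.referent_speaker < close
    addressee_near = d.referent_addressee < close
    if speaker_near and addressee_near:
        # Information shared with the hearer, e.g. confirming -(yo)ne
        return "hearer-sensitive"
    if speaker_near:
        # Information in the speaker's own territory
        return "direct"
    # Information that is not close to the speaker, e.g. ... to omoimasu
    return "indirect"


def choose_register(d: PerceivedDistances, formal_from: float = 0.5) -> str:
    """Pick formal or informal sentence endings from distance (1)."""
    return "formal" if d.speaker_addressee >= formal_from else "informal"


# A pupil commenting on a classmate's composition in a formal setting
pupil = PerceivedDistances(speaker_addressee=0.7,
                           referent_speaker=0.8,
                           referent_addressee=0.1)
print(choose_register(pupil), choose_evidential(pupil))
```

In this sketch the register is decided by distance (1) alone and the evidential category by distances (2) and (3) alone; this separation is a simplification made for readability.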

It should be emphasized here that, like the Japanese honorific
system, the notion of distances (2) and (3) is not absolute but relative
(cf. Corollary Four of relativity of information territory, chapter five).
Again, Shibatani explained this "relativity of distance" in relation with
the honorific system as follows:

One of the characteristics of the Japanese honorific system is that 
this notion of distance is relativized in such a way that the same 
person can be distant or close depending on the distance between 
the speaker and the addressee. When the speaker and the 
addressee are close, and the referent is distant, then referent 
honorifics (subject or object honorifics) will be used. Thus, 
when a mother and daughter are speaking about the father, 
honorifics in reference to the father are or can be used 
(depending on how strict the family is). However, the daughter 
is not supposed to use honorifics in reference to her father when 
she is speaking to someone outside her family. Likewise, in 
reference to the company president, colleagues would use 
honorifics when speaking among themselves. But when they are 
speaking to an outsider, e.g. a customer, the president is placed on 
the speaker's side, and no honorifics would be used in reference 
to the president. 

(1990: 379) 

In this way, the psychological distance that a speaker perceives
among participants and referents is relative. This perception of
relative distance plays a crucial role in Japanese politeness. As noted, a
speaker's perception of his information territory is also relative,
depending on in-group or out-group speech situations. The
evidentiality system is largely based on a speaker's perception of the
territory to which his proposition belongs; therefore, the evidentiality
system is also sensitive to in-group/out-group speech settings.

As explained in Corollary Two of chapter five, certain kinds of 
information (e.g. the speaker's personal matters) are socially defined as 
belonging to the speaker's information territory. Therefore, a
company president's personal matter falls into his secretary's own
territory when the secretary talks to somebody outside the company,
while the same information naturally falls into the president's
information territory when the secretary talks to the president himself.
Thus, a speaker needs to set up a distinct framework of information
territory between his hearer and himself for appropriate evidentiality
marking, just as he does when he uses honorifics; hence, a close
association between Japanese politeness and evidentiality marking is
also suggested.




NOTES: CHAPTER 6 

1 In Leech's theory (1983), indirect speech acts are evaluated as
"polite" while direct speech acts are considered polite only under very
restricted speech circumstances. However, there are some conflicting
observations on this point. For example, Blum-Kulka (1987, 1990) claims 
that indirect speech does not necessarily imply being polite and direct 
speech is not invariably impolite. She argues that the need for 
pragmatic clarity and the need to avoid conflict should be balanced. 
Sometimes, lack of clarity makes a speech indirect and impolite.
Blum-Kulka reports that the balance between the two needs is best
achieved in conventional indirectness. I agree with Blum-Kulka that
being indirect and being polite are not the same phenomenon; however,
it must also be true that indirectness is one of the ways to attain polite
behavior.

2 Fraser (1990), for example, categorized the major existing ways
to view politeness into four groups: the conversational-maxim view (e.g. 
Grice, 1967, first published 1975; Lakoff 1973, 1979; Leech 1983), the 
face-saving view (e.g. Brown and Levinson, 1978, 1987), the social-norm 
view (e.g. Kasher, 1986), and the conversational-contract view (e.g. 
Fraser 1975, Fraser and Nolen, 1981. 1990). 

3 In the strategic view of politeness, the purpose of politeness is
claimed to be the attainment of a "goal"; however, Ervin-Tripp et al.
(1990) documented a case of younger children who seemed to discard
tactical politeness in order to attain their goals. They found school-age
children "dropped" the use of polite forms and mitigators, after
increasing use of polite forms from age two to five. One of the
speculated reasons offered by the researchers is that the children
found that politeness is not sufficiently persuasive; in fact, it can reduce
compliance in speech settings with peers and older siblings, and urgent
pressure can be more persuasive than polite requesting behavior. This
is a case in which speakers avoid politeness to be goal-oriented.

4 In her early seminal writings, Lakoff did not clearly define 
politeness but in her article, "the logic of politeness", she wrote that 
"politeness usually supersedes: it is considered more important in a 
conversation to avoid offense than to achieve clarity. This makes sense, 
since in most informal conversations, actual communication of 
important ideas is secondary to merely reaffirming and strengthening 
relationships" (1973: 297). Also in "Language and woman's place" (1975) 
she wrote "as is often suggested, politeness is developed by societies in 
order to reduce friction in personal interaction..." (p. 64). These 
statements imply that Lakoff viewed politeness as a means to avoid 
friction in human interaction. Later she defined politeness clearly as "a 
means of minimizing the risk of confrontation in discourse---both the 
possibility of confrontation occurring at all, and the possibility that a 
confrontation will be perceived as threatening" (1989: 102). 

Having adapted Grice's framework, Leech said that "far from
being a superficial matter of 'being civil', politeness is an important 
missing link between the CP and the problem of how to relate sense to 
force" (1983: 104). He defined politeness as "those forms of which are 
aimed at the establishment and maintenance of comity, i.e., the ability 
of participants in a socio-communicative interaction to engage in 
interaction in an atmosphere of relative harmony" (Watts, 1989:46). 

Fraser and Nolen's view of politeness is called a "conversational 
contract" which says that conversational partners of any kind enter 
into a conversational contract which is primarily determined by factors 
prior to the conversation, and that during the course of interaction 
both parties re-adjust/re-negotiate the conversational contracts 
regarding each party's mutual rights and obligations. In their view,
each action that violates the conversational contract results in
impoliteness. Fraser and Nolen characterized politeness with a few
remarks. First, politeness is a property associated with a voluntary
action. Second, no sentence is inherently polite or impolite. Third,
whether or not an utterance is heard as being polite is totally in the
hands of the hearer. Finally, politeness is a kind of continuum
rather than a dichotomous notion (1981: 96). However,
politeness itself was not explicitly defined.

Brown and Levinson, basing their model of politeness on social 
theory, presupposed that "the problem for any social group is to control 
its internal aggression while retaining the potential for aggression 
both in internal social control, and, especially, in external competitive 
relations with other groups" and from this perspective "politeness, 
deference, and tact have a sociological significance altogether beyond 
the level of table manners and etiquette books... Politeness, like formal 
diplomatic protocol (for which it must surely be the model), presupposes 
that potential for aggression as it seeks to disarm it, and makes possible 
communication between potentially aggressive partners" (1987: 1). 

5 Hills et al. (1986) explained wakimae (discernment) as "the 
almost automatic observation of socially-agreed-upon rules" which 
applies to both verbal and non-verbal behavior. A capsule definition 
would be "conforming to the expected norm" (p. 348). Part of this 
system is honorific language use in Japanese, but it certainly involves 
other behavioral patterns of Japanese (and probably other Asian) 
people. Some researchers treat discernment as being almost equivalent 
to deference, but there are some differences between the two concepts. 
Hills et al. and also Ide (1982) used wakimae to describe the entire social 
common sense behavioral rules among people in Confucianistic 
societies although it seems that wakimae (discernment) in language 
involves deferential language. It is also similar to Kasper's (1990)
"social-indexing". Treicher et al. said that "deference is power as a
social fact, established a priori by the differential positions of
individuals or groups within the social structure" (p. 65 quoted by
Hwang, 1990). Goffman wrote that "deference ... is that component of
activity which functions as a symbolic means by which appreciation is
regularly conveyed" (1971: 56 quoted by Fraser, 1981). Fraser commented
on this remark and said "the sense in which Goffman uses the term
'appreciate' reflects a giving of personal value to the hearer, the giving
of status, and by doing so creating relative symbolic distance between
the speaker and the hearer" (1981: 97). Discernment as well as
deference reflects the relative status of the interactants on a 
hierarchical social dimension. However, the sense of discernment is 
based on the community members' understanding of the social value of 
their place in the society which they have gained through experiential 
knowledge; therefore, discernment is "process"-oriented, and not 
exactly the automatic adoption of social order that is forced upon each 
community member. It is the result of an individual's analysis of the 
relationship between his social value and other community members'. 
An individual, therefore, has the possibility of showing his rational 
understanding and respect of his place through discernment. In any 
case, his behavior will be treated by others as evidence of his 
understanding (or failure of understanding) his place in the social 
order. Ide (1991) stated that discernment (wakimae) is something that is 
partially realized by obeying the rules of "formality" and "deference" in
R. Lakoff's framework (p. 65). (This note about discernment is based
on personal discussion with Dr. K. Walters of the University of Texas at
Austin.)

6 Janney and Arndt explained "social politeness" and "tact" from
three different perspectives: focus, frame, and function. They said that
social politeness focuses on the "group" (other members of the group)
providing the speakers with socially appropriate communicative forms,
norms, routines, etc, and social politeness functions in the interactional 
frame (people's need for smooth interaction with other members of 
their group). Its function is regulative in facilitating the coordinated 
exchange of routine conversational roles and responsibilities. Tact, on 
the other hand, focuses on the partner in providing interpersonally 
supportive communicative techniques, styles, and strategies. The frame 
for tact is interpersonal since it is concerned with people's need to 
preserve face and maintain positive relationships with others. The 
function of tact is conciliative in that it helps avoid threats to face, and 
facilitates the peaceful negotiation of interpersonal affairs. Note that 
they made a distinction between "the group" (the target of "social 
politeness") and "the partner" (the target of "tact"). I assume that in 
their framework social politeness is the same as conforming to social 
normative rules and convention, so the target was set as the group. 

7 However, there is no simple dichotomy in which formal forms are polite
and informal forms are not; the degree of politeness that polite/plain
forms create depends entirely on each speech setting. (Refer to chapter
four, p. 125.)

8 I also obtained the same information from a questionnaire.
Before collecting data in 1996, I used a questionnaire to elicit self-
reported data from participants on the sentence-ending forms of 
evidentiality markings. I described twelve different speech contexts 
following Kamio's six different information territories both in formal 
and informal settings. I asked the participants to choose utterances 
which they felt were appropriate from listed alternatives or to write 
utterances with their own words if appropriate, and to explain the 
reason that the chosen evidentiality marking is better than others. In 
the end, it was decided not to carry out this questionnaire for 1997 data
collection since the work would be too extraneous, but in the pilot study,
I had more than a dozen reports that remained informative.

9 In the same way, the speech style of Japanese women in
management positions has been analyzed as "motherese" style in
resolving the conflict between socially expected women's powerless
speech and their actual authoritative position (e.g. Smith-Shibamoto,
1992; Sunaoshi, 1995).

10 Trudgill examined the sociolinguists' explanations of their
findings in gender-based speech differences. Explanations include (1)
"researcher's rejections" that the proposed differences between male 
and female speech do not exist in reality, (2) male researcher's sexist 
interpretation of data, (3) female speakers' status-consciousness being 
higher than male speakers, and so on. 

Cameron and Coates (1985) and Cameron (1985) also analyzed
possible explanations. These are summarized under five headings: (1)
"conservatism" (women are more conservative than men, so they stick 
to traditional standard prestigious forms), (2) "social climbing" or 
"status" (women are more sensitive than men to the social meaning of 
speech, and imitate prestige usage in order to elevate their social 
status), (3) "feminine identity", (4) "covert prestige" (masculinity 
cultivated by males has real prestige for working-class males so that in
reality the standard form that is used by women is not prestigious), and (5)
"solidarity" (women's social network is loosely-knit so that women do
not feel the pressure that men feel with vernacular norms). Cameron
and Coates demonstrate that these observations are, more or less, 
problematic. For example, some studies observed women are 
"conservative" and some observed women are "innovative" (e.g. Labov, 
1972a). Cameron and Coates argued that "it appears women are only said
to be conservative when the attribute is out of favor" (1985: 143). They
also argued that "status" and "covert prestige" explanations are
problematic due to commonly used research methodology which
stratifies women as subordinate to men (i.e., father or husband), which
is often not realistic. The observation suggests an existing problem of
using the traditional model in which the family is considered as the 
primary unit of stratification. 

11 "Child care", "socialization with neighbors", and "attending
P.T.A. meetings as children's guardian" characteristically belong to a
traditional women's domain of responsibility in Japanese society.
Therefore, it is quite reasonable to assume that women have more
intimate feelings toward these addressees than men do, and this
intimacy reduces the politeness level that these targets deserve in
women's scoring.

12 This analysis is a little complicated, but very interesting. The
research attempted to separately view the "politeness level assigned by 
the subjects to the addressee's status" [factor (2)] from "the politeness 
level of the actual language forms that the informants claimed to use to 
the addressees of the status" [combination of factor (1) and (2)]. These 
two factors had often been unquestioningly viewed as identical. In this 
research, the researchers found discrepancies between the two levels of 
politeness: the politeness levels of language that the subjects claimed to 
use to certain kinds of addressees (i.e., "spouse", "delivery person", 
"friend", "neighbor", "spouse's friend", "parent at PTA meeting", 
"instructor of hobby group" and "children's professor") are higher 
than the politeness levels that the informants assigned to those 
addressees. Therefore, for these addressees (group 1 addressees), the
subjects are choosing politer sentences than they actually think the
addressees deserve, while addressees of "work-place inferior",
"same-status colleague", and "work-place superior" (group 2 addressees)
received a lower level of politeness in actual language forms than their
status received in the informants' assessment of the status of addressees.
The researchers claimed that female speakers have more frequent
contacts with the group 1 addressees than with group 2 addressees;
therefore, women are more likely to be overly polite to group 1
addressees.

13 Field (1991 quoted by Koo, 1995) and Koo (1995) also used a
modified version of the formula from Brown and Levinson's model to
calculate the weightiness of FTA and politeness level for strategic
politeness and discernment politeness respectively. Field broke down
the variable of social distance into three separate variables: the
familiarity (F) of the speaker with the addressee, the relationship affect
(A), and the familiarity-by-affect interaction (F x A). Field's study
(1991) with American subjects is reported to have confirmed that 
politeness was a function of Affect, Power, Risk and interaction of 
Familiarity and Affect (Koo, 1995: 130). On the other hand, in Koo's 
research with both American and Korean subjects, Affect was not a 
significant predictor of politeness. He concluded that Power and Risk 
were undoubtedly related to politeness; however, the function of Affect 
and Familiarity needs further investigation. Field's formula for 
volitional politeness is as follows: 
[6-16]
Wx = P(H, S) + F(S, H) + F x A(S, H) + Rx
PLx = P(H, S) + F(S, H) + F x A(S, H) + Rx

Wx : the weightiness of the FTAx
P(H, S) : the power that the addressee has over the speaker
F(S, H) : the familiarity of the speaker with the addressee
A(S, H) : the positiveness of affect (i.e., liking) of the speaker
toward the addressee
Rx : the degree to which the FTAx is rated as an imposition in
that culture
PLx : the level of politeness used by the speaker to the addressee

The question that arises here is whether two different types of
politeness can be measured in isolation from each other in a certain
utterance. Koo (1995) assumed that the same three relational factors
determine the level of discernment. When a speaker is not exercising
any FTA (i.e., non-Rx situation) "pure politeness" is shown in his
speech, and only interpersonal distance factors decide the level of
politeness (discernment politeness). Koo's formula is as below:


[6-17]
PLx = P(H,S) + F(S, H) + A(S, H) + F x A(S, H) (for discernment politeness)


However, as I argued earlier, we can assume that no interaction can be
one hundred percent free from potential FTA, thus "pure" politeness in
Koo's sense may not exist. It should also be noted that Brown and
Levinson articulated that their discussions of 'calculating' the relative
weight of an FTA are to be taken metaphorically and that they are not
concerned with (or interested in) efforts to operationalize their theory
in positivistic terms.
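
As a rough illustration of how the additive formulas in [6-16] and
[6-17] combine their terms, the following Python sketch sums a set of
hypothetical ratings. The class and function names, the numeric scale,
and the treatment of the F x A interaction as a simple product are all
assumptions made for this illustration; they are not taken from Field
or Koo. Given Brown and Levinson's caution noted above, the numbers
should be read metaphorically rather than as a measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """Hypothetical ratings on an arbitrary common scale (an assumption)."""
    power: float        # P(H, S): power the addressee has over the speaker
    familiarity: float  # F(S, H): familiarity of the speaker with the addressee
    affect: float       # A(S, H): positiveness of the speaker's affect


def interaction(rel: Relation) -> float:
    # F x A(S, H), modelled here as a simple product (an assumption).
    return rel.familiarity * rel.affect


def field_weightiness(rel: Relation, imposition: float) -> float:
    # [6-16] as printed in this note: P(H, S) + F(S, H) + F x A(S, H) + Rx
    return rel.power + rel.familiarity + interaction(rel) + imposition


def koo_discernment(rel: Relation) -> float:
    # [6-17]: P(H, S) + F(S, H) + A(S, H) + F x A(S, H), with no Rx term
    return rel.power + rel.familiarity + rel.affect + interaction(rel)


# Made-up ratings for a single speaker-addressee pair.
example = Relation(power=3.0, familiarity=1.0, affect=2.0)
print(field_weightiness(example, imposition=1.5))  # 3.0 + 1.0 + 2.0 + 1.5 = 7.5
print(koo_discernment(example))                    # 3.0 + 1.0 + 2.0 + 2.0 = 8.0
```

With these made-up ratings the two sums differ only because [6-17]
drops Rx and, as printed, counts A(S, H) as a separate term.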





CHAPTER 7: CONCLUSION 

In this research, I have tried to empirically demonstrate several
major contentions. Those arguments are listed below to
describe the research results, together with the summary listing for
chapter five (p. 325):

(1) Japanese sentence-ending forms present the strongest modality 
marking in a sentence that includes evidentiality in which a speaker 
expresses the degree to which he commits himself to his proposition. 
Sentence-medial evidentiality codings generally function to mitigate 
the effect of assertive sentence-ending forms. 
(2) The choice of sentence-ending evidentiality is not 
grammaticalized; however most often the use of situationally 
appropriate evidentials is a pragmatic requirement for a competent 
speaker of Japanese, because the proper evidentiality concept functions 
to make the utterance polite in both formal and informal speech 
situations. The use of proper sentence-ending evidentials, together 
with the appropriate use of honorifics and other formal forms, is a 
pragmatic requirement in formal speech situations. In addition, the 
appropriate use of evidentials is also required to be competent in 
informal speech situations by providing "intimate politeness" that 
creates a harmonious interpersonal mood (i.e., the concept of "wa"). A
speaker who is incompetent in using situationally appropriate
evidentials might be stigmatized in the Japanese community due to the
overly assertive nature of his linguistic behavior.
(3) The proposed model of Japanese evidentiality is based on the 
concept of territory of information. There are certain types of 
information to which a speaker has socially authorized primary access.
Each individual has information which belongs to his own information
territory that he can claim by using direct evidentials. Indirect
evidentials are used to express information which does not fall in the
speaker's information territory. Therefore, the Japanese system of
evidentiality is not based on the speaker's experience only. I argued
that native Japanese speakers share the concept of information
territory and tend to express their respect for other people's
information territory--particularly the hearers' information
territory--through the use of appropriate evidential forms.

(4) The model suggests that speakers are more respectful of other 
people's information territory in formal speech situations than 
informal speech situations: speakers tend to be more indirect in formal 
communication (i.e., emphasis of distance). In informal situations, 
speakers did not unanimously respect other people's information 
territory, but did show respect for the hearer's information territory in 
particular. In informal communication, the evidentials of shared 
information are most emphasized (i.e., emphasis of closeness). The 
evidentials of shared information among conversationalists are a 
characteristic of Japanese evidentiality coding. 
(5) The use of appropriate evidentiality is both discernmental (or 
deferential) and strategic. To a certain degree, the use of commonly 
preferred evidential forms can be considered as a kind of common-sense
discernmental linguistic behavior. Strategic use of non-standard
evidentials functions as "evidentiality implicature". Characteristically, 
evidentiality implicature is strategically used for either showing 
intentional assertiveness or being more indirect than required. 
(6) The phenomena explored in this research have not received
sufficient attention, nor have they been well understood, even by
teachers of Japanese. As a result, learners of Japanese tend to use simple direct
endings which are grammatically correct but in reality are not popular
among native speakers due to a strong nuance of assertiveness. Since 
there is an apparent discrepancy between grammatical sentence forms 
and situationally appropriate sentence-ending forms from the 
evidentiality perspective, this study may provide a pedagogical 
implication for teaching the Japanese language to non-native speakers. 

My interest in this research topic arose from my teaching 
experience with American students who made me wonder about the 
reasons for the difficulty of learning and using natural sounding 
Japanese sentence endings. To begin with, it was necessary to analyze 
what "natural" ending forms are. It was observed that learners do not 
acquire situationally appropriate Japanese sentence-ending forms for 
the simple reason that they are not explicitly taught the system (i.e., the 
model I proposed). It was also observed that learners transfer their 
native concept of evidentiality into the target language, Japanese. 
Thus, the issue is partly a cultural matter. 

JAPANESE HOMOGENEOUS CULTURE AND INFORMATION SHARING MILIEU 

At the beginning of this dissertation, I wrote that I treat the less 
assertive nature of Japanese speech as a "linguistic" phenomenon. 
However, obviously the issue is closely tied with culture. The close 
relationship of the evidentiality concept in politeness with the 
Confucian wakimae (discernment) concept suggests the significant role 
of culture in this linguistic issue. 

Furthermore, the territory-conscious psychology of Japanese
may also be a cultural issue. In chapter one, I briefly discussed Hall's
idea of high- and low-context cultures. Hall (1976) hinted at the
totalitarian character (or "collective egos" by Araki, 1980) of the
Japanese approach to life as follows:

In Japan, the over-all approach to life, institutions, government, 
and the law is one in which one has to know considerably more 
about what is going on at the covert level than in the West. It is 
very seldom in Japan that someone will correct you or explain 
things to you. You are supposed to know, and they get quite upset 
when you don't. Also, Japanese loyalties are rather concrete and 
circumscribed. You join a business firm and, in a larger sense, 
you belong to the Emperor. You owe each a debt that can never 
be repaid. Once a relationship is formed, loyalty is never 
questioned. What is more, you have no real identity unless you do 
belong. This does not mean that there aren't differences at all 
levels between people, ranging from the interpersonal to the 
national. It is just that differences are reexpressed and worked 
out differently. As in all high context systems, the forms that are 
used are important. To misuse them is a communication in itself. 
(97-98) 

Hall's observation was made only twenty-one years ago, but I
should say that this view of Japanese culture is slightly anachronistic
(and was likely so even at the time it was made). Most Japanese no
longer feel that they "belong" to the Emperor, and nowadays people may
not be concerned with the Confucian "debt which is never repaid" (that is
inherently attached to one's existence).

But Hall's observation does hold some truth regarding unique
Japanese cultural behavior: it is considered impolite to explain things
in detail since the hearer might already know; an individual tends to
identify himself with groups to which he belongs and maintain loyalty;
and all kinds of customary "forms" are important in human
relationships.
The preferred behavior of avoiding correction and explanation
referred to by Hall is a part of Japanese lack of assertiveness. Japanese
society is, as is well-known, homogeneous; thus, for an individual, to
expect (or pretend to expect) that other people share the same
information functions very effectively to create a mood of
homogeneity. Stating clearly what one believes to be true is not
preferred by either speakers or hearers; thus speakers tend to be
ambiguous regarding the core meaning of their assertions and allow
the hearer to decipher the meanings based on an assumed common
understanding. Due to possible ambiguity created by this behavior,
effective communication is not always realized.

In terms of evidentiality coding, the emphasis on shared 
information among the speakers through "evidentials for shared 
information" was strongly confirmed across all speech situations. The 
evidentiality behavior in this research presents an aspect of the 
traditional cultural behavior of emphasizing common background 
information among group members. An observer may speculate that a 
Western individual-oriented culture is likely to be more sensitive to 
each other's information territory than a group-oriented Japanese 
culture. No contrastive analysis was done between East and West in this 
study; therefore, an observation on this point would not be empirically 
valid. In my 1993 study, however, both American and Japanese
informants showed consciousness of a difference between the speaker's
and the hearer's information territory, although Japanese informants
were more sensitive to information shared across participants'
territories. This tendency in Japanese was also confirmed in this
research. Presumably both Western and Japanese cultures have the
concept of both personal territory and group territory, although the
emphasis may be placed differently.

Yet at the same time, it was noted that even in informal Japanese 
situations, the speaker does not breach the information territory of the 
hearer. Thus the techniques of "emphasis of common knowledge" and
"respect for each other's personal group information territory" were
found to be important in Japanese with regard to the concept of
information territory. These two evidentiality aspects seem to 
correspond to the social uchi vs. soto concept in Japanese. 

JAPANESE UCHI VS. SOTO CULTURE AND TERRITORY OF INFORMATION 

Japanese people's loyalty to groups and sense of identification 
with groups typically accounts for the group-orientation of Japanese 
society. Hall (1976) also stated that high-context cultures make greater 
distinctions between insiders and outsiders than low-context cultures. 
The idea is that Japanese society is administered through the "logic of 
group" while Western societies are driven through the "logic of 
individual". A Japanese anthropologist, Watsuji (1935), earlier proposed 
this kind of concept, a dichotomy of uchi (inside) vs. soto (outside). He
argued that in Japanese culture an individual embraces the concept of
uchi (lit. household) as being the group(s) to which he belongs. The 
important aspect here is that the concept of uchi is relative and flexible. 
The smallest unit is said to be a household, but uchi can also mean 
vicinity, school, business organization, or the Japanese race. This 
relativity of the uchi concept leads to the relativity of information 
territory in that one's information territory is similar to a person's uchi 
territory. Within uchi, members feel safe and comfortable; they
cooperate, and rely on each other excessively.1 This uchi and soto
concept is seen in the use of the Japanese language, including
grammar. Wetzel (1984) devoted her dissertation to this uchi/soto 
concept in Japanese linguistic phenomena such as polite forms, 
donatory forms, and deixis. 

One example of uchi/soto-related grammar in Japanese is the use 
of go/come verbs added to other action verbs to emphasize the action of 
going and coming while performing some target actions. Ando (1986) 
showed the following examples of sentences to explain this grammar 
point: 

(7-1)

(1) a. itte-kimasu.
       go(te)-come(FOR)
       (I will leave [and will come back].)

    b. hanako ga hon o katte-kita.
       Hanako(name) NOM book ACC buy(te)-came
       (Hanako bought a book [and came back].)

(2) a. taroo ga kaette-itta.
       Taroo NOM return home(te)-went
       (Taroo went back home [and left].)

    b. hanako ga hon o katte-itta.
       Hanako NOM book ACC buy(te)-went
       (Hanako bought a book [and left].)

Ando explained that if a speaker adds the verb kuru (to come) to
another action verb--for example, iku (to go) and kau (to buy) in
(7-1)(1)--the compound verb phrase indicates that a person (or his
behavior) is coming to the speaker's territory, and if the verb iku (to
go) is added to main action verbs as in the (2) sentences of (7-1), it
emphasizes that the action is going out of the speaker's territory. Ando
argues that this structure of an action verb plus iku (to go) or kuru (to 
come) is a cultural artifact of the Japanese uchi/soto distinction which 
is critically important in Japanese psychology. In this sense, the social 
aspect of group-orientation has a common ground with the linguistic 
aspect of territory-consciousness in language use. 

Watsuji also said that individual distinctions disappear in uchi 
circumstances: in other words, the solidarity of individuals is not 
important in uchi. I wonder to what degree this observation is true in 
present Japanese culture, but this likely holds some truth in contrast 
with Western cultures. At least, people are often "encouraged" to do as 
other people do, and as long as one behaves as others do, one is "safe" in
the society.2 Regarding evidentiality, in family speech settings,
certainly speakers were less attentive to each other's information 
territory than they were in a formal or friend discourse. I have 
attributed this phenomenon to the issue of politeness, but at the same 
time, the tendency of using direct evidentials in family discourse can be 
considered to be a representation of the unindividualistic atmosphere 
inside uchi . The uchi concept from the sociological viewpoint may 
represent collectivism, while with the viewpoint from information 
territory, uchi represents sharing of the same information territory. 
Social and linguistic uchi phenomena are seen in culturally 
conventional behavior. Ando (1984) pointed out that, for example, in 
Japan when an individual's family member(s), especially children, 
receive a gift, the individual is expected to thank the sender of the gift 
even though he himself may not be the ultimate receiver of the gift. A 
Japanese wife is (traditionally) expected to greet her husband's business 
associates (including superiors, colleagues, or even lower-status 
workers) by saying something such as itsumo shujin ga osewa ni natte-imasu (lit. I 
know that you always take good care of my husband). It sounds strange 
in English translation but conveys a humble appreciation in Japanese. 
This kind of utterance is absolutely necessary in dealing with family 
members' soto relations and if one does not perform in this manner, he 
will be labelled socially incompetent. If a family member commits some 
crime, it will possibly result in the end of all family members' normal 
life. It is common for the family of a criminal to publicly apologize for 
the actions of this family member. This happened this year 
(August, 1997) in the case of a ninth-grader who beheaded a second-
grader, drawing the attention of the entire country due to the 
criminal's cruelty and his propaganda against the Japanese education 
system. 

This is in stark contrast to American mainstream culture, where 
the role of the family is often to proclaim the guilty party's innocence 
and good character, or possibly to expound on reasons that the guilty 
party is not responsible for his actions. This difference is likely seen 
because in a Japanese group, a member is considered responsible for 
the actions of all other group members, and the group as a whole is 
responsible for maintaining harmony with other group units within 
the same larger group. In contrast, in American culture, group 
members are primarily responsible for looking out for other group 
members, when galvanized by outside pressure. 

The shame, responsibility, and guilt the family feels toward the 
seken (society) may reflect the influence of Confucianism. Ando concluded 
that such Japanese uchi behavior indicates that the life of all members 
is "connected" ("renzoku-teki ningen kankei") in an uchi environment. 

I have observed that some phrases of appreciation are 
conventionally used among members to emphasize the "connected" 
human relationship. The phrase okagesama de (lit. thanks to you, 
[something good has been accomplished]) is a necessary response to 
praise for something that has been done by or has happened to the 
speaker himself or his immediate family. Usually the hearer of the 
phrase has nothing to do with the incident; thus, the underlying 
meaning is something like "thanks to your support, which is created by 
your kind existence." This Japanese uchi-related behavior is so 
conventional that it should be considered to be "forms" for both formal 
and informal environments. Thus, Japanese conventional forms of social 
interaction emphasize the "connection" among members of uchi; in other 
words, they present an aspect of territory-consciousness in Japanese 
language use. 

LESS ASSERTIVE JAPANESE CULTURE 

In a society such as this with a strong emphasis on "wa" 
(harmony among people), being assertive is not a good idea. Avoidance 
of conflict, which results in less assertive linguistic behavior, is often 
said to be one of the stereotypical aspects of Japanese "wa" culture. In 
her study on the "functional interdependence" between conflict and 
culture, Ting-Toomey (1985) argued that conflict and culture are two 
inseparable concepts, and said that high-context cultures such as 
Japanese have high "cultural cognitive, emotional, and behavioral 
constraints"3 on conflict which suppress interpersonal antagonism, 
public tension, and public confrontations. In reality, I know that 
Japanese people do occasionally have confrontations in public, but 
the basic cultural agreement on avoiding direct confrontation is 
probably valid. "Nemawashi" (lit. root binding) is a well-known custom 
that works as a confrontation-avoidance system. 
Nemawashi is the use of "unofficial" discussions and negotiation for the 
purpose of securing the agreement of the members before the "official" 
decision-making process. For example, if a group of people within a 
larger organizational body want to reach a certain organizational 
decision, they contact other important members, one by one, explaining 
their views and persuading them. Thus, when the time comes to have a 
formal discussion to decide the issue, almost all participants already 
share the same view, and the final official decision is reached quickly 
and smoothly without antagonism. This custom of nemawashi seems 
to be deeply rooted in Japanese culture. Phrases such as 
nemawashi-shitokoo (let's do "nemawashi" in advance) can even be heard 
from middle-school students. 

Ting-Toomey (1985) also mentions the similar "ringi-sei" and 
"go-between" systems of Japanese culture. The ringi system (lit. circulation 
discussion) is used to involve a large number of people in a single (not 
uncommonly, unimportant) decision. Obviously, the purpose is to 
distribute responsibility to "everyone" and diffuse it by emphasizing 
that the decision is unanimously agreed upon.4 The existence of a 
"chuukai-sha" (go-between) also helps avoid direct confrontation 
between two parties (note: chuukaisha are ordinary people, not 
lawyers), and thus works to save both parties' face (e.g. Gudykunst, 1993, 
1994). Obviously the nemawashi and ringi systems function in uchi 
group situations, and the go-between system functions between groups. 
These systems demonstrate how Japanese culture values wa 
(harmony of people) within the group, and avoids direct confrontations 
with other groups. Both respect for wa and avoidance of confrontation 
establish the foundation for the less assertive linguistic behavior of 
Japanese speakers. 

In this research, it was found that Japanese speakers use direct 
forms more frequently than expected. However, it was also found that the 
use of simple direct forms is limited; speakers preferred to add some 
kind of indirect or semi-indirect modality to direct forms, which I 
tentatively called "direct question forms", "sharing forms", or "rapport 
forms". Even when a speaker's commitment to the proposition is high, 
he often includes a questioning flavor in the sentence ending--
confirming whether the hearer agrees with him, or reminding the 
hearer that he has the same information, or even genuinely questioning 
whether he is right--using the shared concept of hearer-sensitive 
evidentiality. These phenomena exist among uchi members. Towards soto 
information, the distance that a speaker perceives between himself (or 
his uchi world) and the topic is usually expressed through indirect 
evidentiality, particularly in formal situations. Thus, I have explained 
it as a consequence of politeness based on territory consciousness. As I 
wrote in note 1 of this chapter, Japanese people appear to be apathetic 
towards soto members. Although there is a frequently quoted proverb 
that says an individual must anticipate three enemies once he leaves his 
home, in actuality, Japanese people are not that hostile to strangers. But 
they certainly are unwilling to interact unless absolutely necessary.5 If 
interaction is required, the politeness of linguistic indirection is 
commonly maintained towards soto speakers as the standard 
evidentiality model suggests. 

In this work, I have attempted to identify the Japanese cultural 
phenomena which appear to be related to the use of proper 
evidentiality coding, which I have tried to present in an organized way. 
Provided with sufficient accurate information, people can learn other 
cultures and accept them at least on the surface. The use of evidential 
codings appropriate to each speech situation is a part of Japanese 
cultural behavior, as it is necessary to correctly express the Japanese 
concept of human relationship in linguistic forms. Without 
understanding and using the concept of evidentiality, one cannot 
produce culturally appropriate utterances in Japanese. 

LIMITATIONS OF THE STUDY 

The study had several limitations which further studies might 
address: 

First, although I believe that the informants as a whole represent 
the Japanese community to a considerable extent, since they are a 
group of "convenience" samples gathered from my associates, they may 
well represent Japanese speakers from "my" linguistic environment 
rather than the entire Japanese speech community. Although Japanese 
society is highly homogeneous, there are likely regional and class 
differences within the scope of this research. Therefore, a random 
sampling from a wider population will certainly be necessary for more 
reliable data. 

Second, since I wanted to acquire a model which could be 
generalized, the quantity of analyzed data took precedence over deeper 
qualitative analysis of each individual discourse or utterance. Although 
the method I used may serve the purpose of this study, it is undeniable 
that "deep analysis" of a limited number of interactions for each 
discourse type, for example, might have revealed different research 
results. In this sense, my analysis may have fallen short of 
understanding the deep meaning of the speakers' evidential usages. The 
best method, I suppose, would involve an informant's explanation and 
retrospective analysis of his own speech behavior, something which 
this study could not attain to a sufficient degree. 

Third, for the same reason as above, I had to simplify the types of 
discourse: six discourse types were considered to represent the parts of 
various speech situations. In a sense, I have attempted to view Japanese 
speech behavior through a limited set of six types of human interaction. 
Naturally, there are plenty of additional speech situations that were not 
considered in this research. Moreover, a finer stratification based on 
additional situational variables seemed to be desirable in each discourse 
type. Regarding the "formal discussion" genre, for example, I have 
realized that there were possibly a number of situational features of this 
genre which influence the speaker's use of evidentials. Although all 
formal discussion discourses which were analyzed in this genre had two 
basic common features, i.e., "high formality" and "group discussion", 
variables in the participants' relationships such as power, affinity, 
and familiarity, and also each person's psychological traits in 
interacting with others, seemed to affect their use of linguistic 
evidentiality. It was not possible to include these finer differences 
sufficiently to make complete observations because of the additional 
analytical work this would have required; therefore, the simple 
categorization of formal vs. informal was chosen. I can defend this 
method by pointing out that the speaker's "perceived distance" among 
himself (his knowledge or information territory), the referent, and the 
hearer (his knowledge or information territory), which also decides the 
level of appropriate politeness that the speaker perceives, is 
theoretically inclusive of all situational features. However, it is still 
difficult to tell what distance a speaker perceives from the various 
features he is facing. We may possibly gain some answers to this 
question through consulting the speaker himself. 

Fourth, although the setting of the proposition types was 
performed based on the earlier analysis of empirical data, six basic types 
of proposition can still be too simplistic to represent the topics of the 
entire speech with epistemic modality. The classification was done in 
relation to the theory of speaker's information territory, which, I 
believe, is a theoretically and also empirically meaningful framework; 
however, there must, of course, be an infinite number of other 
perspectives that can be used to categorize the speaker's propositions in 
order to study evidentiality coding even within the framework of the 
theory of speaker's information territory. As the traditional analysis of 
evidentiality has been criticized for its dependency on the "true or 
false" aspect of the proposition, the viewpoint of this research may have 
some points which require reassessment. 

These and other thoughts on possible limitations suggest that a 
close qualitative analysis of fewer discourses will show different aspects 
of this research, which may provide future direction to this course of 
study. 


CHAPTER 7: NOTES 

1 In contrast, Japanese people are sometimes said to be exclusive 
and hostile towards soto. Ando (1984) refers to Japanese tourists' 
shameful behavior in foreign countries, Japanese people's careless 
attitudes towards keeping public places clean, the bullying of alienated 
pupils at schools, and so on. Most characteristically, I believe, Japanese 
people avoid interaction with strangers as much as possible. That is 
totally different from American mainstream culture, in which strangers 
often engage in friendly conversations in elevators, on public 
transportation, and in other public places. This demonstration of 
instantaneous friendliness with strangers usually amazes Japanese 
people visiting America. 

Japanese governmental policy also exposes the same kind of 
exclusive tendency. For example, it is extremely difficult for 
non-Japanese people to obtain Japanese nationality, and Japan rarely 
receives permanent immigrants. 

Ando cited Watsuji (1936:165) saying that in Japanese life which 
has been centered on "home", people did not learn to assert individual 
rights, and at the same time, did not come to realize their responsibility 
towards public life (soto). Japanese people developed delicate 
interpersonal emotions such as omoiyari (consideration), hikaeme 
(modesty), and itawari (concern and care) which are only benevolent 
in uchi relations, and not strong enough for the soto world where 
they did not share warm emotions with outsiders. Thus, people came to 
feel surrounded by enemies once they step out of their home. 

I agree with Ando's comment that Watsuji's observation is still 
valid today, sixty years later. 

2 I believe that this emphasis on homogeneity is explicitly taught 
through everyday life. When I was a child, the most common reason 
I was given for not doing something was that other children 
did not do so. School regulations for clothing, grooming, and 
after-school activities were extremely detailed (I think they are still so 
today). It is a culture that emphasizes "negation of self" instead of the 
"assertion of self" seen in, for example, American mainstream culture. 
Emphasis on homogeneity may work effectively to achieve the holistic 
purpose of a group but there is a danger of creating members who are 
unable to perform independently or who are not willing to explore their 
creative potential to the maximum. I feel that this is a serious 
disadvantage of being Japanese. 

3 Ting-Toomey explained as follows: 

Cultural cognitive constraints refer to belief systems or ideologies 
that prevent or discourage group members from cognitively 
thinking in a particular direction. Cultural emotional constraints 
arise from cultural norms that dictate what sorts of emotional 
expressions (such as anger, frustration, or grief) are acceptable or 
unacceptable to be outwardly displayed in the public cultural 
context. Finally, cultural behavioral constraints refer to cultural 
rules and codes that govern the behavioral appropriateness of a 
given gesture, or words and phrases in a given socio-cultural 
context. Hence, a low cultural demand/low cultural constraint 
system represents a diverse heterogeneous cultural paradigm (for 
example, U.S. culture); [a high cultural demand/high cultural constraint 
system represents] a relatively unified, homogeneous cultural 
paradigm (for example, Japanese culture). (p. 74) 

4 In large Japanese organizations, most documents have 
designated locations for reviewers to stamp their "seals" (a seal is used as 
a signature in Japan). The bigger the organization is, the more often 
members in high-level positions are required to affix their seals to 
documents, possibly a hundred times a day; so they often simply affix 
their seals without reading the document. This is called mekura-ban 
(blind seal). Therefore, in reality, the usefulness of "ringi-sei" is 
questionable. However, its surface function is still valued. 


5 That Japanese people are apathetic towards strangers (soto 
people), and that Japanese people feel responsibility towards society for 
their family's crime, may seem contradictory. However, both can be 
explained by the relativity of the uchi concept. In the latter case, society 
as a whole is regarded as uchi in relation to the enhanced responsibility 
of an individual or a family as a member of society, while in normal 
circumstances, unknown people are regarded as outsiders. 


Appendix A 

List of informants' codes, ages, and discourse types. 

Informant: f01 Age: 40s Discourse Type: informalgroup 
Informant: f02a Age: 40s Discourse Type: informalgroup 
Informant: f02b Age: 40s Discourse Type: informalgroup 
Informant: f03a Age: 40s Discourse Type: informalgroup 
Informant: f03b Age: 40s Discourse Type: formalgroup 
Informant: f04 Age: 40s Discourse Type: informalgroup 
Informant: f05a Age: 40s Discourse Type: informalgroup 
Informant: f05b Age: 40s Discourse Type: formalgroup 
Informant: f05c Age: 40s Discourse Type: family 
Informant: f06 Age: 20s Discourse Type: informalgroup 
Informant: f07 Age: 20s Discourse Type: informalgroup 
Informant: f08 Age: 20s Discourse Type: informalgroup 
Informant: f09 Age: 20s Discourse Type: formalgroup 
Informant: f10 Age: 30s Discourse Type: informalgroup 
Informant: f11 Age: 20s Discourse Type: informalgroup 
Informant: f12 Age: 40s Discourse Type: informalgroup 
Informant: f13 Age: 40s Discourse Type: family 
Informant: f14 Age: 40s Discourse Type: family 
Informant: f15 Age: 40s Discourse Type: family 
Informant: f16 Age: 30s Discourse Type: family 
Informant: f17 Age: 20s Discourse Type: public 
Informant: f18 Age: 20s Discourse Type: public 
Informant: f19 Age: 20s Discourse Type: formal 
Informant: f20 Age: 30s Discourse Type: formal 
Informant: f21 Age: 20s Discourse Type: formal 
Informant: f22a Age: 60s Discourse Type: formal 
Informant: f22b Age: 60s Discourse Type: formal 
Informant: f22c Age: 60s Discourse Type: formal 
Informant: f22d Age: 60s Discourse Type: formal 
Informant: f23 Age: 20s Discourse Type: formal 
Informant: f24 Age: 50s Discourse Type: courtprosecutor 
Informant: f25 Age: 40s Discourse Type: schoolteacher 
Informant: f26 Age: 40s Discourse Type: schoolteacher 
Informant: f27 Age: 10s Discourse Type: family 
Informant: f28 Age: 10s Discourse Type: family 
Informant: f29 Age: 10s Discourse Type: family 
Informant: m01 Age: 70s Discourse Type: informalgroup 
Informant: m02 Age: 30s Discourse Type: informalgroup 
Informant: m03 Age: 30s Discourse Type: informalgroup 
Informant: m04 Age: 40s Discourse Type: family 
Informant: m05 Age: 30s Discourse Type: family 
Informant: m06 Age: 20s Discourse Type: formal 
Informant: m07 Age: 30s Discourse Type: formal 
Informant: m08 Age: 30s Discourse Type: formal 
Informant: m09 Age: 30s Discourse Type: formal 
Informant: m10 Age: 70s Discourse Type: formal 
Informant: m11 Age: 40s Discourse Type: formal 
Informant: m12 Age: 40s Discourse Type: formal 
Informant: m13 Age: 40s Discourse Type: formal 
Informant: m14 Age: 40s Discourse Type: formal 
Informant: m15 Age: 60s Discourse Type: formal 
Informant: m16 Age: 30s Discourse Type: formal 
Informant: m17 Age: 70s Discourse Type: courtdefendant 
Informant: m18 Age: 50s Discourse Type: courtprosecutor 
Informant: m19 Age: 50s Discourse Type: courtprosecutor 
Informant: m20 Age: 80s Discourse Type: courtdefendant 
Informant: m21 Age: 50s Discourse Type: courtprosecutor 
Informant: m22 Age: 40s Discourse Type: family 
Informant: m23 Age: 10s Discourse Type: family 
Informant: m24 Age: 10s Discourse Type: family 
Informant: m25 Age: 10s Discourse Type: family 
Informant: m26 Age: 30s Discourse Type: public 
Informant: m27 Age: 20s Discourse Type: public 
Informant: m28 Age: 40s Discourse Type: public 
Informant: s01 Age: 10s Discourse Type: schoolstudents 
Informant: s02 Age: 8 Discourse Type: schoolstudents 


Appendix B 
Sentence-ending forms 

Direct endings 

GROUP 1 sentence-ending forms are the most direct sentence-ending 
forms, including direct forms of verbs, adjectives, and the copula, and 
simple noun utterances. These direct forms can be followed by vocative 
sentence-ending suffixes, e.g., -yo, -no, and -sa. 

In English, it is difficult to show the difference in meanings in these 
ending forms, but roughly, nouns and simple direct endings are 
"direct". Vocative final particles extend the speaker's conviction (i.e., I 
am telling you), the -n+da cluster and -wake have an "explaining" nuance 
(i.e., if you understand), and conjunctive endings, -kara, -node, -kedo, 
and -ga, show direct modality with pretended hesitancy. 

Noun. (informal)
D (direct) (informal, formal)    D n+da (informal, formal)
D yo. (informal, formal)    D n+da+yo. (informal, formal)
D wa+yo. (informal, formal)
D no+yo. (informal, formal)
D wake+yo. (informal, formal)    D wake+na+n+da+yo. (informal, formal)
D wa. (informal, formal)
D sa. (informal, formal)
D no. (informal, formal)    D n+da+mo+no. (informal, formal)
D wake (da). (informal, formal)    D wake+na+n+da (informal, formal)
D kara/node. (informal, formal)    D n+da+kara/node. (informal, formal)
D kedo/ga. (informal, formal)    D n+da+kedo/ga. (informal, formal)
D wake+da+kara (informal, formal)
D wake+da+kedo (informal, formal)

GROUP 2 sentence endings include endings that use -ne. with a falling 
intonation. Endings of this group are also direct, but are 
hearer-conscious in that the use of a falling -ne aims to draw the hearer's 
attention to the speech. 

In English, the meaning of all the following forms is to attract the 
hearer's attention (e.g. you know, you see). 

D no+ne. (informal, formal)    D n+desu+no+ne. (formal)
D kara+ne. (informal, formal)    D n+da+kara+ne. (informal, formal)
D kedo/ga+ne. (informal, formal)    D n+da+kedo/ga+ne. (informal, formal)
D wa+ne. (informal, formal)
D ne. (informal, formal)    D n+da+ne. (informal, formal)
D yo+ne. (informal, formal)    D n+da+yo+ne. (informal, formal)
D wa+yo+ne. (informal, formal)
D no+yo+ne. (informal, formal)
D wake+ne. (informal, formal)
D wake+yo+ne. (informal, formal)    D wake+na+n+da+yo+ne. (informal, formal)
D na. (informal, formal)
D naa. (informal, formal)
D yo+na. (informal, formal)

GROUP 3 sentence endings, daroo. and janai. and related forms, are 
also direct evidentials but are sensitive to the hearer's knowledge in 
that they check or confirm that knowledge. 

The general meaning of these endings is that of a tag-question (i.e., 
isn't it.) which is not actually asking for the hearer's agreement. 

SD CONFIRM daroo. (informal, formal)    SD n+CONFIRM daroo. (informal, formal)
SD janai. (informal, formal)    SD n+janai. (informal, formal)
Q janai+no. (informal)
Q janai+ka. (informal, formal)    Q janai+no+ka. (informal, formal)
Q janai+ka+na. (informal)    Q n+janai+ka. (informal, formal)

GROUP 4 sentence endings are direct sentences with a questioning 
tone. The speaker asks for the hearer's agreement to his speech with 
these ending forms. 

So the meaning of the following is, in general, a tag-question such as 
isn't it? 

DQ ne. (informal, formal)    DQ n+da+ne. (informal, formal)
DQ yo+ne. (informal, formal)    DQ n+da+yo+ne. (informal, formal)
DQ kara/node ne. (informal, formal)    DQ n+da+kara/node+ne. (informal, formal)
DQ yo+na. (informal, formal)
DQ janai/jan. (informal, formal)    DQ n+janai. (informal, formal)
Q janai ka. (informal, formal)    Q janai+no+ka. (informal, formal)
Q janai no. (informal, formal)    Q n+janai no. (informal, formal)
DQ CONFIRM daroo.    DQ n+CONFIRM daroo. (informal, formal)
DQ CONFIRM daroo+ne.    DQ n+CONFIRM daroo+ne.
Q -(da)kke.
QD Quasi-question intra-sentential rising phrase.
QD Quasi-question sentence-ending.

GROUP 5 sentence-endings (ne#) emphasize the common knowledge 
between the speaker and the hearer. The meanings of the following 
forms are, thus, as we both know. 

SD ne# (informal, formal)    SD n+da+ne# (informal, formal)
SD yo+ne# (informal, formal)    SD n+da+yo+ne# (informal, formal)
SD no+ne# (informal, formal)
SD kara(node)+ne# (informal, formal)    SD n+dakara+ne# (informal, formal)
SD kedo(keredo)+ne# (informal, formal)
SD janai+ne# (informal, formal)

GROUP 6 sentence endings are question forms that request new 
information. 

Q kashira. (informal, formal)
Q ka. (informal, formal)    Q no+(desu)+ka. (informal, formal)
Q ka+na. (informal, formal)
Q ka+ne. (informal, formal)    Q no+(desu)+ka+na. (informal, formal)
Q CONJ daroo+ka. (informal, formal)    Q n+CONJ daroo+ka. (informal, formal)
Q CONJ deshoo+ka+ne. (informal, formal)    Q n+CONJ daroo+ka+ne (informal, formal)
Q Direct ending. (informal, formal)
Q Noun. (informal)
Q ka. (informal, formal)    D no+ka. (informal, formal)
Q no. (informal, formal)
Q wake. (informal, formal)
Q ka+na. (informal, formal)
Q no+ne. (informal, formal)    Q n+(desu) ka+ne. (informal, formal)
Q ka+ne. (informal, formal)
Q -kke. (informal, formal)    D n+dakke. (informal, formal)


Indirect endings 

GROUP 7 sentence-endings are indirect in meaning and have 
syntactically indirect structures. Group 7 forms express propositions 
inferred from indirect evidence. Mitai, yoo, and rashii mean looks like, 
seems like, or appears to be. Rashii can be a hearsay evidential too. 

ID mitai/yoo (informal, formal)    ID mitai/yoona+N+da (informal, formal)
ID mitai/yoo+yo (informal, formal)
ID mitai/yoo+na+no (informal)
ID mitai/yoo+da+kedo (informal, formal)
ID mitai/yoona+n+da+kedo (informal, formal)
ID mitai/yoo(da)+ne. (informal, formal)
ID mitai/yoona+n+da+ne. (informal, formal)
ID mitai/yoo+yo+ne. (informal, formal)
ID mitai/yoo na+no+ne. (informal, formal)
ID mitai/yoo(da)+ne. (informal, formal)
ID mitai/yoona+n+da+ne. (informal, formal)
ID mitai/yoo(da) ne# (informal, formal)
ID mitai+janai. (informal, formal)

ID rashii (informal, formal)
ID rashii+yo (informal, formal)    ID rashii N da yo (informal, formal)
ID rashii+no (informal, formal)
ID rashii+no yo (informal, formal)
ID rashii+na (informal, formal)
ID rashii + kedo/ga (informal, formal)    ID rashii n da kedo/ga (informal, formal)
ID rashii + kara/node (informal, formal)
ID rashii + kedo/ga ne. (informal, formal)    ID rashii n da kedo/ga ne. (informal, formal)
ID rashii+yo ne. (informal, formal)
ID rashii+ne. (informal, formal)    ID rashii n da ne. (informal, formal)
ID rashii+no ne. (informal, formal)
ID rashii+ne. (informal, formal)    ID rashii n da ne. (informal, formal)
ID rashii n da yo ne# (informal, formal)

GROUP 8 sentence endings are indirect in meaning and construct 
syntactically indirect structures. Group 8 forms express that the 
proposition is second-hand information. 

(1) -(da)tte means 'it is said such and such'. It directly transfers 
second-hand information without modification. 

ID -datte, -tte, etc (informal, formal)    ID n+datte (informal, formal)
ID n datte + yo (informal, formal)
ID (da)tte+ne. (informal, formal)    ID n datte + ne. (informal, formal)
ID (da)tte+ne# (informal, formal)

(2) -(da)soo(da) means it is said so or I heard so. 
ID (da) soo (da) (informal, formal)    ID n da soo (da) (informal, formal)
ID (da) soo (da) ne. (informal, formal)
ID (da) soo (da) ne. (informal, formal)
ID (da) soo (da) ne# (informal, formal)
ID (da) soo dakedo (informal, formal)

(3) -to kiita (I heard so), -to iwareteiru, -to iu hanashi, and others all 
mean I heard or it is said. For convenience, -to kiita is used to represent 
all of them. 
ID -to kiita (informal, formal)    ID -to kiita N da (informal, formal)
ID -to kiita+yo (informal, formal)    ID -to kiita n da yo (informal, formal)
ID -to kiita+kedo (informal, formal)    ID -to kiita n da kedo (informal, formal)
ID -to kiita+kedo ne. (informal, formal)    ID -to kiita n da kedo ne. (informal, formal)
ID -to kiita + ne. (informal, formal)    ID -to kiita n da ne. (informal, formal)
ID -to kiita + no. (informal, formal)
ID -to kiita + no ne. (informal, formal)
ID -to kiita + no yo. (informal, formal)
ID -to ka.

GROUP 9 endings are epistemic auxiliaries. 

(1) Kamoshirenai means might be. The degree of necessity of 
propositional truth is low in the speaker's judgement. 
AUX kamoshirenai/kamo (informal, formal)
AUX kamoshirenai+na (informal, formal)
AUX kamoshirenai+yo (informal, formal)    AUX kamoshirenai n da yo (informal, formal)
AUX kamoshirenai node/kara (informal, formal)
AUX kamoshirenai kedo/ga (informal, formal)
AUX kamo- + yo ne. (informal, formal)    AUX kamo- n da + yo ne. (informal, formal)
AUX kamoshirenai+ne. (informal, formal)
AUX kamoshirenai+ne. (informal, formal)
AUX kamo- n da kedone. (informal, formal)
AUX kamo-+kedo + ne. (informal, formal)
AUX kamoshirenai+kedo+ne. (informal, formal)
AUX kamoshirenai+ne# (informal, formal)    AUX kamo- n da ne# (informal, formal)
AUX kamo- + yo ne. (informal, formal)    AUX kamo- n da + yo ne. (informal, formal)
AUX kamoshirenai + janai

(2) Hazu(da) means "it must be such and such based on some evidence" 
expressing the speaker's strong belief in the necessity of the 
proposition. 
AUX hazu(da) (informal, formal)
AUX hazu(da) yo (informal, formal)    AUX hazu na n da yo (informal, formal)
AUX hazu na n da kedo (informal, formal)
AUX hazu na n da ne. (informal, formal)
AUX hazu+CONFIRM daroo. (informal, formal)

(3) -Ni chigainai (i.e., it must be so, there is no mistake about it) 
provides an inference with strong conviction. This type was used only 
once in the data. 
AUX -ni chigainai (informal, formal) 

(4) "Conjecture daroo" means probably. 
AUX CONJ daroo. (informal, formal)
AUX CONJ daroo + kedo/ga (informal, formal)    AUX n+daroo kedo/ga (informal, formal)
AUX CONJ daroo + kara (informal, formal)
AUX CONJ daroo ne. (informal, formal)    AUX n+daroo ne. (informal, formal)
AUX CONJ daroo na. (informal, formal)
AUX CONJ daroo+kedo+ne. (informal, formal)
AUX CONJ daroo + kedo + ne. (informal, formal)
AUX CONJ daroo ne# (informal, formal)    AUX n+daroo ne# (informal, formal)

GROUP 10 includes endings meaning "I think." These evidentials
indicate that the proposition is speaker-subjective. With this group
the speaker's commitment to the proposition is high, but the
subjective nature of the inference is emphasized.

All lexical items related to "thought" are included: omou (think),
omot-teiru (think-tentative), kangaeru, kangae-teiru (think),
rikaisuru, rikaishi-teiru (understand), kanjiru, kanji-teiru (feel),
etc. For convenience, all ending forms listed here use omou.

ID omou/omotteiru (informal, formal)
ID omou/omotteiru n da (informal, formal)
ID omowareru (informal, formal)
ID omou yo (informal, formal)
ID omou n da yo (informal, formal)
ID omou wa (informal, formal)
ID omou wake (da) (informal, formal)
ID omou wa yo (informal, formal)
ID omou no (informal, formal)
ID omou na. (informal, formal)
ID omou n da na. (informal, formal)
ID omou kara/node (informal, formal)
ID omou n da kara (informal, formal)
ID omou kedo/ga (informal, formal)
ID omou n da kedo/ga (informal, formal)
ID omou kedo ne. (informal, formal)
ID omou n da kedo ne. (informal, formal)
ID omou no yo. (informal, formal)
ID omou no ne. (informal, formal)
ID omou no yo ne. (informal, formal)
ID omou ne. (informal, formal)
ID omou n da ne. (informal, formal)
ID omou yo ne. (informal, formal)
ID omou n da yo ne. (informal, formal)
ID omou yo ne. (informal, formal)
ID omou wake da yo ne. (informal, formal)
ID omou no ne. (informal, formal)
Q omou. (informal, formal)
Q omowanai. (informal, formal)




Appendix C 

Meanings of grammatical evidentials by Willett (1988:96) 

I. Direct evidence: the speaker claims to have perceived the situation
   described, but may not specify that it is sensory evidence of any kind.
   A. Visual evidence: the speaker claims to have seen the situations
      described.
   B. Auditory evidence: the speaker claims to have heard the situations
      described.
   C. Sensory evidence: the speaker claims to have physically sensed
      the situation described. This can be viewed as (a) in opposition to
      one or both of the above senses (i.e. any other sense), or (b)
      unspecified as to sensory mode (i.e. any sense).
II. Indirect evidence: the speaker claims not to have perceived the
   situation described, but may not specify whether the evidence he does
   have is reported to him or is the basis of an inference he has made.
   A. Reported evidence: the speaker claims to know of the situation
      described via verbal means, but may not specify whether it is
      hearsay (i.e. second-hand or third-hand), or is conveyed through
      folklore.
      1. Second-hand evidence: the speaker claims to have heard of the
         situation described from someone who was a direct witness.
      2. Third-hand evidence: the speaker claims to have heard about
         the situation described, but not from a direct witness.
      3. Evidence from folklore: the speaker claims that the situation
         described is part of established oral history.
   B. Inferring evidence: the speaker claims to know of the situation
      described only through inference, but may not specify whether such
      inference is based on observable results or solely on mental
      reasoning.
      1. Inference from the results: the speaker infers the situation
         described from his observable evidence.
      2. Inference from reasoning: the speaker infers the situation
         described on the basis of intuition, logic, a dream, previous
         experience, or some other mental construct.



Appendix E-1 

Speaker F2 ("normal" discourse), Informal friend discourse, occurrence 
of ending forms for each proposition type. 

Informant: f02a Age: 40s Discourse Type: informal group

Information type A
D daroo descending informal : 1
DQ daroo ascending informal : 2
D (direct) informal : 31
DQ ja nai ascending informal : 1
D n dakara informal : 2
D n dakedo ne descending informal : 1
D n da yo informal : 2
D no descending informal : 25
D no ne descending informal : 4
D noun informal : 3
D no yo informal : 10
D wa yo informal : 3
D yo informal : 16
D yo ne descending informal : 1
id omou informal : 1
Information type B
D daroo descending informal : 1
D ja nai descending informal : 5
D no ne descending informal : 1
DQ quasi-q intra : 2
Information type C
DQ daroo ascending informal : 3
D (direct) informal : 1
DQ ja nai ne # informal : 1
Q ja nai no ascending informal : 1
DQ ja nai ascending informal : 11
DQ kara ne ascending informal : 2
D ne # informal : 1
DQ ne ascending informal : 1
D no descending informal : 2
DQ quasi-q intra : 2
D yo ne # informal : 4
D yo ne descending informal : 1
DQ yo ne ascending informal : 1
q -kke ascending informal : 2
Information type D
q direct ascending informal : 1
Information type E
Information type F
DQ daroo ascending informal : 1
D (direct) informal : 6
AUX hazu (da) yo informal : 1
D ja nai descending informal : 1
Q ja nai no ascending informal : 1
DQ ja nai ascending informal : 7
AUX kamoshirenai informal : 2
D kara informal : 1
D n dakara informal : 1
D n dakedo ne descending informal : 1
D n da mono informal : 1
DQ n daroo ascending informal : 1
D ne descending informal : 1
DQ n ja nai ascending informal : 1
D noun informal : 3
D no yo informal : 6
D no yo ne descending informal : 1
DQ quasi-q intra : 2
D yo informal : 3
D yo ne descending informal : 1
id (da) tte informal : 15
id -to kiita no ne descending informal : 1
id n da tte formal : 3
id n da tte informal : 7
id omou informal : 2
id omou n da kedo ne descending informal : 1
id omou ne descending informal : 1
id omou no yo descending informal : 1
id omou no yo ne descending informal : 1
id omou wa yo formal : 1
id omou yo informal : 2


Information type G
D (direct) informal : 2
D ja nai descending informal : 1
DQ ja nai ascending informal : 3
D kara informal : 3
D n da informal : 2
D n da mono informal : 1
D n da yo ne descending informal : 1
D ne descending informal : 2
DQ ne ascending informal : 1
D no descending informal : 3
D noun informal : 3
DQ quasi-q ending : 1
DQ quasi-q intra : 4
D wa informal : 1
D wake informal : 2
D yo informal : 7
D yo ne descending informal : 3
id (da) tte informal : 7
id -toka descending informal : 1
id -to kiita informal : 1
id -to kiita kedo ne descending informal : 1
id -to kiita ne descending informal : 2
id -to kiita no ne descending informal : 2
id -to kiita no yo descending informal : 1
id mitai (da) yo informal : 1
id n da tte formal : 6
id n da tte informal : 1
q -kke ascending informal : 1
q direct ascending informal : 1
q n da kke ascending informal : 2


Information type H
AUX conjecture daroo descending informal : 1
D na descending informal : 1





Appendix E-2 

Speaker F2 ("Reporter" discourse), Informal friend discourse, 
occurrence of ending forms for each proposition type. 

Informant: f02b Age: 40s Discourse Type: informal group

Information type A
D (direct) informal : 3
D noun informal : 1
D no yo informal : 1
D yo informal : 2
Information type B
Information type C
DQ daroo ascending informal : 2
Information type D
q direct ascending informal : 1
q noun ascending informal : 1
Information type E
Information type F
D daroo descending informal : 1
DQ daroo ascending informal : 2
D (direct) informal : 5
DQ ja nai ascending informal : 1
D kara informal : 1
D n dakara informal : 1
D n da mono informal : 1
D ne descending informal : 2
DQ n ja nai ascending informal : 1
D no descending informal : 7
D noun informal : 3
D no yo informal : 7
D wake informal : 1
D wake yo informal : 1
D yo informal : 5
id (da) tte informal : 3
id -to kiita informal : 1
id -to kiita kedo ne descending informal : 1
id mitai na n da informal : 1
id mitai (da) yo informal : 3
id n da tte formal : 9
id n da tte informal : 2
Information type G
DQ daroo ascending informal : 1
Information type H





Appendix F 
Occurrence of ending forms by group for each proposition type for
each discourse type.

Discourse type: formal 
Info type:Male+Female+Student=Total


a: 590 + 474 + 0 =1064
b: 29 + 17 + 0 = 46
c: 94 + 182 + 0 = 276
d: 17 + 169 + 0 = 186
e: 13 + 138 + 0 = 151
f: 87 + 82 + 0 = 169
g: 14 + 14 + 0 = 28
h: 27 + 46 + 0 = 73 
Total: 871 +1122 + 0 =1993
Discourse type: public 
Info type: Male+Female+Student=Total


a: 88 + 150 + 0 = 238
b: 5 + 6 + 0 = 11
c: 4 + 47 + 0 = 51
d: 3 + 21 + 0 = 24
e: 0 + 10 + 0 = 10
f: 17 + 39 + 0 = 56
g: 0 + 8 + 0 = 8
h: 3 + 0 + 0 = 3
Total: 120 + 281 + 0 = 401
Discourse type: friend 
Info type: Male+Female+Student=Total


a: 276 + 527 + 0 = 803
b: 12 + 33 + 0 = 45
c: 40 + 151 + 0 = 191
d: 24 + 107 + 0 = 131
e: 16 + 35 + 0 = 51
f: 86 + 258 + 0 = 344
g: 74 + 211 + 0 = 285
h: 14 + 40 + 0 = 54 
Total: 542 +1362 + 0 =1904
Discourse type: family 
Info type: Male+Female+Student=Total


a: 240 + 380 + 0 = 620
b: 10 + 31 + 0 = 41
c: 75 + 172 + 0 = 247
d: 40 + 110 + 0 = 150
e: 10 + 34 + 0 = 44
f: 129 + 89 + 0 = 218
g: 71 + 40 + 0 = 111
h: 11 + 20 + 0 = 31 
Total: 586 + 876 + 0 =1462
Discourse type: courtprosecutor 
Info type:Male+Female+Student=Total


a: 46 + 6 + 0 = 52
b: 5 + 4 + 0 = 9
c: 102 + 19 + 0 = 121
d: 37 + 17 + 0 = 54
e: 55 + 6 + 0 = 61
f: 21 + 6 + 0 = 27
g: 0 + 0 + 0 = 0
h: 0 + 0 + 0 = 0
Total: 266 + 58 + 0 = 324
Discourse type: courtdefendant 
Info type:Male+Female+Student=Total


a: 261 + 0 + 0 = 261
b: 0 + 0 + 0 = 0
c: 16 + 0 + 0 = 16
d: 0 + 0 + 0 = 0
e: 6 + 0 + 0 = 6
f: 22 + 0 + 0 = 22
g: 5 + 0 + 0 = 5
h: 0 + 0 + 0 = 0
Total: 310 + 0 + 0 = 310
Discourse type: school 
Info type: Male+Female+Student=Total


a: 0 + 57 +159 = 216
b: 0 + 17 + 2 = 19
c: 0 + 123 + 14 = 137
d: 0 + 96 + 38 = 134
e: 0 + 20 + 6 = 26
f: 0 + 44 + 51 = 95
g: 0 + 0 + 0 = 0
h: 0 + 3 + 0 = 3
Total: 0 + 360 +270 = 630
Discourse type: all 
Info type: Male+Female+Student=Total
a:1501 +1594 +159 =3254
b: 61 + 108 + 2 = 171
c: 331 + 694 + 14 =1039
d: 121 + 520 + 38 = 679
e: 100 + 243 + 6 = 349
f: 362 + 518 + 51 = 931
g: 164 + 273 + 0 = 437
h: 55 + 109 + 0 = 164 
Total:2695 +4059 +270 =7024
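
The arithmetic of the combined ("all") table can be checked mechanically. The following is a minimal sketch, not part of the original analysis: the figures are copied from the table above, and the names ALL_DISCOURSE, REPORTED_TOTAL, check_rows and column_sums are illustrative assumptions introduced here.

```python
# Counts copied from the "Discourse type: all" table:
# info type -> (male, female, student, printed total).
ALL_DISCOURSE = {
    "a": (1501, 1594, 159, 3254),
    "b": (61, 108, 2, 171),
    "c": (331, 694, 14, 1039),
    "d": (121, 520, 38, 679),
    "e": (100, 243, 6, 349),
    "f": (362, 518, 51, 931),
    "g": (164, 273, 0, 437),
    "h": (55, 109, 0, 164),
}

# The printed "Total" row of the same table.
REPORTED_TOTAL = (2695, 4059, 270, 7024)


def check_rows(table):
    """Return the info types whose male + female + student differs from the printed total."""
    return [
        info_type
        for info_type, (male, female, student, total) in table.items()
        if male + female + student != total
    ]


def column_sums(table):
    """Sum each column (male, female, student, total) over all info types."""
    return tuple(sum(row[i] for row in table.values()) for i in range(4))


if __name__ == "__main__":
    print("rows that do not add up:", check_rows(ALL_DISCOURSE))
    print("column sums match printed totals:",
          column_sums(ALL_DISCOURSE) == REPORTED_TOTAL)
```

The same two functions can be applied to any of the per-discourse-type tables by substituting its rows and its printed total row.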



Appendix G 
Occurrence of ending forms by group for all proposition types. 

Group 1 
823/3247 (25%) D (direct) informal 
487/3247 (14%) D (direct) formal 
242/3247 ( 7%) D no descending informal 
195/3247 ( 6%) D noun informal 
141/3247 ( 4%) D n dakedo formal 
122/3247 ( 3%) D yo informal 
120/3247 ( 3%) D n da formal 
113/3247 ( 3%) D kara informal 
108/3247 ( 3%) D n da yo formal 
99/3247 ( 3%) D kedo formal 
88/3247 ( 2%) D kedo informal 
80/3247 ( 2%) D no yo informal 
78/3247 ( 2%) D n dakedo informal 
67/3247 ( 2%) D kara formal 
60/3247 ( 1%) D wake informal 
57/3247 ( 1%) D wake formal 
53/3247 ( 1%) D sa informal 
53/3247 ( 1%) D n da yo informal 
31/3247 ( 0%) D n da informal 
30/3247 ( 0%) D yo formal 
26/3247 ( 0%) D wake da yo formal 
23/3247 ( 0%) D da yo formal 
15/3247 ( 0%) D dakedo formal 
15/3247 ( 0%) D wa yo informal 
12/3247 ( 0%) D wake dakedo formal 
11/3247 ( 0%) D n dakara formal 
9/3247 ( 0%) D wake da kara formal 
9/3247 ( 0%) D n dakara informal 
7/3247 ( 0%) D n da mono informal 
6/3247 ( 0%) D wa informal 
6/3247 ( 0%) Q omou ascending formal 
6/3247 ( 0%) D no descending formal 
6/3247 ( 0%) D wake na n da yo formal 
6/3247 ( 0%) D wake yo informal 
5/3247 ( 0%) id omou informal 
4/3247 ( 0%) id omou n da kedo formal 
4/3247 ( 0%) noun informal 
4/3247 ( 0%) D n wake yo informal 
3/3247 ( 0%) D wa informal 
3/3247 ( 0%) D wake na n da formal 
3/3247 ( 0%) id omou kedo formal 
2/3247 ( 0%) id omou n da kedo informal 
2/3247 ( 0%) id omou n da yo formal 
2/3247 ( 0%) D wa formal 
2/3247 ( 0%) id omou yo formal 
2/3247 ( 0%) D wa yo formal 
2/3247 ( 0%) Q omou ascending informal 
1/3247 ( 0%) id omou no ne ascending informal 
1/3247 ( 0%) id omou yo informal 
1/3247 ( 0%) id omou n da ne descending formal 
1/3247 ( 0%) id omou yo ne ascending informal 
1/3247 ( 0%) D n da mono descending informal 
Group 2 
130/ 863 (15%) D ne descending formal 
103/ 863 (11%) D n da yo ne descending formal 
98/ 863 (11%) D n da ne descending formal 
95/ 863 (11%) D ne descending informal 
90/ 863 (10%) D no ne descending informal 
59/ 863 ( 6%) D yo ne descending informal 
55/ 863 ( 6%) D yo ne descending formal 
32/ 863 ( 3%) D n dakedo ne descending formal 
24/ 863 ( 2%) D n da yo ne descending informal 
22/ 863 ( 2%) D na descending informal 
16/ 863 ( 1%) D kara ne descending informal 
15/ 863 ( 1%) D wake yo ne descending formal 
14/ 863 ( 1%) D wake ne descending formal 
13/ 863 ( 1%) D kedo ne descending informal 
12/ 863 ( 1%) D kara ne descending formal 
10/ 863 ( 1%) D wa ne descending informal 
10/ 863 ( 1%) D n da ne descending informal 
9/ 863 ( 1%) D n dakedo ne descending informal 
8/ 863 ( 0%) D kedo ne descending formal 
7/ 863 ( 0%) D no ne descending formal 
6/ 863 ( 0%) D wake yo ne descending informal 
5/ 863 ( 0%) D n dakara ne descending formal 
5/ 863 ( 0%) D naa descending informal 
4/ 863 ( 0%) D n da na descending informal 
3/ 863 ( 0%) D wa ne descending formal 
3/ 863 ( 0%) D no yo ne descending informal 
3/ 863 ( 0%) D wake ne descending informal 
3/ 863 ( 0%) D no yo ne descending formal 
2/ 863 ( 0%) D wa yo ne descending informal 
2/ 863 ( 0%) D n desu no ne descending formal 
1/ 863 ( 0%) D yo na informal 
1/ 863 ( 0%) D n dakara ne descending informal 
1/ 863 ( 0%) D n dakedo formal 
1/ 863 ( 0%) D n da no ne descending formal 
1/ 863 ( 0%) D wake na n da yo ne descending formal 
Group 3 
60/ 173 (34%) D daroo descending informal 
59/ 173 (34%) D ja nai descending informal 
14/ 173 ( 8%) D daroo descending formal 
13/ 173 ( 7%) q ja nai ka descending formal 
11/ 173 ( 6%) D n daroo descending informal 
5/ 173 ( 2%) D n daroo descending formal 
3/ 173 ( 1%) D n ja nai descending informal
2/ 173 ( 1%) q n ja nai ka descending formal 
1/ 173 ( 0%) D ja nai ka formal 
1/ 173 ( 0%) q ja nai ka na descending informal 
1/ 173 ( 0%) D ja nai descending formal 
1/ 173 ( 0%) AUX n conjecture daroo ne descending formal 
1/ 173 ( 0%) q ja nai no descending informal 
1/ 173 ( 0%) q ja nai no ka descending formal 
Group 4 
124/ 744 (16%) DQ daroo ascending informal 
79/ 744 (10%) DQ ne ascending formal 
73/ 744 ( 9%) DQ ja nai ascending informal 
61/ 744 ( 8%) DQ yo ne ascending formal 
56/ 744 ( 7%) DQ daroo ascending formal 
46/ 744 ( 6%) DQ ne ascending informal 
38/ 744 ( 5%) DQ n daroo ascending informal 
37/ 744 ( 4%) DQ n ja nai ascending informal 
26/ 744 ( 3%) DQ yo ne ascending informal 
25/ 744 ( 3%) DQ n da ne ascending formal 
25/ 744 ( 3%) DQ quasi-q intra 
20/ 744 ( 2%) DQ n da yo ne ascending formal 
16/ 744 ( 2%) DQ no ne ascending informal 
12/ 744 ( 1%) DQ n daroo ascending formal 
11/ 744 ( 1%) q ja nai no ascending informal 
10/ 744 ( 1%) DQ quasi-q ending 
9/ 744 ( 1%) DQ n da yo ne ascending informal 
9/ 744 ( 1%) q ja nai ka ascending formal 
8/ 744 ( 1%) DQ daroo ne ascending formal 
7/ 744 ( 0%) Q yo ne ascending informal 
6/ 744 ( 0%) DQ n daroo ne ascending formal 
6/ 744 ( 0%) q n ja nai no ascending informal 
5/ 744 ( 0%) DQ no ne ascending formal 
5/ 744 ( 0%) Q ja nai no ascending informal 
5/ 744 ( 0%) q ja nai n desuka ascending formal 
3/ 744 ( 0%) DQ n daroo ne ascending informal 
3/ 744 ( 0%) DQ kara ne ascending informal 
2/ 744 ( 0%) DQ n da yo ascending formal 
2/ 744 ( 0%) q -kke ascending informal 
2/ 744 ( 0%) DQ yo na ascending informal 
2/ 744 ( 0%) DQ kara ne ascending formal 
1/ 744 ( 0%) q n ja nai n desu ka ascending formal 
1/ 744 ( 0%) DQ n da ne ascending informal 
1/ 744 ( 0%) DQ daroo ne # informal 
1/ 744 ( 0%) q ja nai no ka ascending formal 
1/ 744 ( 0%) DQ daroo ne ascending informal 
1/ 744 ( 0%) q n ja nai no ka ascending formal 
1/ 744 ( 0%) q n ja nai no ascending formal 
1/ 744 ( 0%) q ja nai kke ascending formal 
1/ 744 ( 0%) DQ n dakara ne ascending formal 
1/ 744 ( 0%) DQ n ja nai no ascending informal 
1/ 744 ( 0%) D ja nai no descending informal 
Group 5 
60/ 219 (27%) D yo ne # formal 
54/ 219 (24%) D ne # formal 
35/ 219 (15%) D ne # informal 
30/ 219 (13%) D yo ne # informal 
7/ 219 ( 3%) D kara ne # formal 
7/ 219 ( 3%) D no ne # informal 
6/ 219 ( 2%) D n da yo ne # formal 
4/ 219 ( 1%) D n da ne # formal 
4/ 219 ( 1%) D n da ne # informal 
3/ 219 ( 1%) D no ne # formal
2/ 219 ( 0%) D kedo ne # informal 
2/ 219 ( 0%) D n da yo ne # informal 
2/ 219 ( 0%) n da yo ne # informal 
1/ 219 ( 0%) D n dakara ne # informal 
1/ 219 ( 0%) DQ ja nai ne # informal 
1/ 219 ( 0%) D kedo ne # formal 
Group 6 
161/ 898 (17%) q direct ascending informal 
132/ 898 (14%) q ka ascending formal 
121/ 898 (13%) q no ascending informal 
81/ 898 ( 9%) q noun ascending informal 
45/ 898 ( 5%) q direct ascending formal 
38/ 898 ( 4%) q n desu ka ascending formal 
36/ 898 ( 4%) q ka na ascending informal 
32/ 898 ( 3%) q ka na descending informal 
27/ 898 ( 3%) q daroo ka descending formal 
26/ 898 ( 2%) q -kke ascending informal 
24/ 898 ( 2%) q ka descending formal 
22/ 898 ( 2%) q n desu ka descending formal 
17/ 898 ( 1%) q ka ascending informal 
17/ 898 ( 1%) q wake ascending informal 
13/ 898 ( 1%) q no ka ascending informal 
11/ 898 ( 1%) q kashira descending informal 
9/ 898 ( 1%) q no ascending formal 
9/ 898 ( 1%) q no ka ascending formal 
7/ 898 ( 0%) q n da ka ascending formal 
7/ 898 ( 0%) q n daroo ka descending formal 
6/ 898 ( 0%) q ka ne ascending formal 
6/ 898 ( 0%) DQ -kke ascending informal 
5/ 898 ( 0%) q ka ne descending formal 
5/ 898 ( 0%) q ka ne ascending informal 
5/ 898 ( 0%) q -kke ascending formal 
5/ 898 ( 0%) q no ka na ascending informal 
4/ 898 ( 0%) q ka descending informal 
4/ 898 ( 0%) q daroo ka ne descending formal 
4/ 898 ( 0%) q direct ascending formal 
2/ 898 ( 0%) q ka na descending formal 
2/ 898 ( 0%) q n desu ka ne descending formal 
2/ 898 ( 0%) q no ne ascending informal 
2/ 898 ( 0%) q daroo ka descending informal 
2/ 898 ( 0%) q wake desu ka ascending formal 
2/ 898 ( 0%) Q no ka descending formal 
2/ 898 ( 0%) q n da kke ascending informal 
1/ 898 ( 0%) q n da -kke ascending 
1/ 898 ( 0%) q kashira descending formal 
1/ 898 ( 0%) Q wake ascending formal 
1/ 898 ( 0%) q n da -kke na descending informal 
1/ 898 ( 0%) q n daroo ka ne descending formal 
Group 7 
16/ 115 (13%) id mitai informal 
10/ 115 ( 8%) id rashii no ne descending informal 
8/ 115 ( 6%) id mitai (da) yo informal 
8/ 115 ( 6%) id rashii informal 
7/ 115 ( 6%) id rashii yo formal 
5/ 115 ( 4%) id mitai da kedo formal 
5/ 115 ( 4%) id mitai formal 
4/ 115 ( 3%) id mitai (da) ne descending formal 
4/ 115 ( 3%) id rashii kedo informal 
3/ 115 ( 2%) id rashii yo informal 
3/ 115 ( 2%) id rashii n da ne ascending formal 
3/ 115 ( 2%) id rashii kedo formal 
2/ 115 ( 1%) id rashii formal 
2/ 115 ( 1%) id rashii n dakedo ne descending formal 
2/ 115 ( 1%) id rashii no descending informal 
2/ 115 ( 1%) id rashii yo ne descending formal 
2/ 115 ( 1%) id rashii n dakedo ne descending informal 
2/ 115 ( 1%) id mitai na n da kedo informal 
2/ 115 ( 1%) id rashii n dakedo formal 
2/ 115 ( 1%) id rashii n da ga formal 
2/ 115 ( 1%) id mitai (da) ne descending informal 
1/ 115 ( 0%) id rashii n da yo formal 
1/ 115 ( 0%) id mitai (da) ne descending informal 
1/ 115 ( 0%) id rashii n dakedo informal 
1/ 115 ( 0%) id mitai na n da ne ascending formal 
1/ 115 ( 0%) id mitai na n da ne descending formal 
1/ 115 ( 0%) id mitai (da) yo formal
1/ 115 ( 0%) id mitai (da) ne ascending formal 
1/ 115 ( 0%) id mitai na no descending informal 
1/ 115 ( 0%) id rashii n da yo ne descending formal 
1/ 115 ( 0%) id mitai na n da formal 
1/ 115 ( 0%) id rashii ne descending informal 
1/ 115 ( 0%) id mitai (da) ne ascending informal 
1/ 115 ( 0%) id rashii no yo informal 
1/ 115 ( 0%) id rashii na descending informal 
1/ 115 ( 0%) id mitai ja nai ascending informal 
1/ 115 ( 0%) id rashii kara informal 
1/ 115 ( 0%) id mitai da ne # formal 
1/ 115 ( 0%) id mitai na n da informal 
1/ 115 ( 0%) id rashii n da ne descending formal 
1/ 115 ( 0%) id rashii no ne ascending informal 
1/ 115 ( 0%) id rashii ne ascending informal 
Group 8 
67/ 299 (22%) id (da) tte informal 
31/ 299 (10%) id n da tte informal 
31/ 299 (10%) id -to kiita informal 
28/ 299 ( 9%) id -to kiita formal 
27/ 299 ( 9%) id n da tte formal 
16/ 299 ( 5%) id -to kiita yo descending informal 
16/ 299 ( 5%) id -toka descending informal 
10/ 299 ( 3%) id n da tte yo formal 
9/ 299 ( 3%) id -to kiita kedo formal 
9/ 299 ( 3%) id (da) soo (da) formal 
6/ 299 ( 2%) id (da) tte formal 
6/ 299 ( 2%) id -to kiita no ne descending informal 
5/ 299 ( 1%) id n da tte ne ascending formal 
4/ 299 ( 1%) id -to kiita ne descending informal 
4/ 299 ( 1%) id (da) soo (da) informal 
4/ 299 ( 1%) id -to kiita kedo informal 
3/ 299 ( 1%) id -to kiita kedo ne descending formal 
2/ 299 ( 0%) id n (da) soo (da) ne ascending formal 
2/ 299 ( 0%) id -to kiita n da kedo formal 
2/ 299 ( 0%) id (da) soo (da) ne descending formal 
2/ 299 ( 0%) id -to kiita kedo ne descending informal 
2/ 299 ( 0%) id (da) soo dakedo formal 
2/ 299 ( 0%) id -to kiita n da ne descending formal
1/ 299 ( 0%) id -to kiita n da yo formal 
1/ 299 ( 0%) id n da soo da formal 
1/ 299 ( 0%) id (da) tte ne ascending informal 
1/ 299 ( 0%) id (da) soo (da) ne # formal 
1/ 299 ( 0%) id (da) soo (da) ne ascending formal 
1/ 299 ( 0%) id (da) tte ne # formal 
1/ 299 ( 0%) id -to kiita n da informal 
1/ 299 ( 0%) id -to kiita no descending informal 
1/ 299 ( 0%) id -to kiita no yo descending informal 
1/ 299 ( 0%) id -to kiita n da kedo informal 
1/ 299 ( 0%) id -to kiita n da formal 
Group 9 
17/ 152 (11%) AUX kamoshirenai informal 
10/ 152 ( 6%) AUX conjecture daroo ne descending formal 
10/ 152 ( 6%) AUX conjecture daroo ne # formal 
9/ 152 ( 5%) AUX conjecture daroo ne descending informal 
9/ 152 ( 5%) AUX conjecture daroo descending formal 
9/ 152 ( 5%) AUX conjecture daroo descending informal 
6/ 152 ( 3%) AUX kamoshirenai kedo formal 
6/ 152 ( 3%) AUX conjecture daroo kedo informal 
5/ 152 ( 3%) AUX kamoshirenai ne ascending informal 
5/ 152 ( 3%) AUX conjecture daroo kedo formal 
4/ 152 ( 2%) AUX kamoshirenai kedo informal 
4/ 152 ( 2%) AUX conjecture daroo ne # informal 
3/ 152 ( 1%) AUX kamoshirenai n da yo ne descending informal
3/ 152 ( 1%) AUX hazu (da) formal 
3/ 152 ( 1%) AUX kamoshirenai ne descending informal 
2/ 152 ( 1%) AUX n conjecture daroo ne descending informal 
2/ 152 ( 1%) AUX n conjecture daroo ne descending formal 
2/ 152 ( 1%) AUX conjecture daroo kedo ne descending informal 
2/ 152 ( 1%) AUX kamoshirenai yo ne descending formal 
2/ 152 ( 1%) AUX n conjecture daroo ne descending informal
2/ 152 ( 1%) q daroo ka descending formal 
2/ 152 ( 1%) AUX n conjecture daroo descending informal 
2/ 152 ( 1%) AUX kamoshirenai ne ascending formal 
2/ 152 ( 1%) AUX hazu na n da kedo informal 
2/ 152 ( 1%) AUX kamoshirenai kedo ne descending informal 
2/ 152 ( 1%) AUX n conjecture daroo kedo formal 
2/ 152 ( 1%) q n daroo ka descending formal 
1/ 152 ( 0%) AUX kamoshirenai yo formal 
1/ 152 ( 0%) AUX hazu na n da yo ne ascending formal 
1/ 152 ( 0%) AUX conjecture daroo kara informal 
1/ 152 ( 0%) AUX n daroo ne # formal 
1/ 152 ( 0%) AUX hazu na n da yo formal 
1/ 152 ( 0%) AUX -ni chigai nai informal 
1/ 152 ( 0%) AUX kamoshirenai n da ne # formal 
1/ 152 ( 0%) AUX kamoshirenai n dakedo ascending formal 
1/ 152 ( 0%) AUX n conjecture daroo kedo informal 
1/ 152 ( 0%) AUX kamoshirenai n da yo formal 
1/ 152 ( 0%) D conjecture daroo na descending informal 
1/ 152 ( 0%) AUX kamoshirenai yo ne ascending formal 
1/ 152 ( 0%) AUX kamoshirenai node formal 
1/ 152 ( 0%) AUX kamoshirenai na descending informal 
1/ 152 ( 0%) AUX conjecture daroo kedo ne ascending formal 
1/ 152 ( 0%) AUX n conjecture daroo ne # formal 
1/ 152 ( 0%) AUX hazu (da) yo informal 
1/ 152 ( 0%) AUX kamoshirenai kara informal 
1/ 152 ( 0%) AUX n conjecture daroo ne descending formal 
1/ 152 ( 0%) AUX kamoshirenai formal 
1/ 152 ( 0%) AUX hazu deshoo ascending informal 
1/ 152 ( 0%) AUX kamoshirenai ne # formal 
1/ 152 ( 0%) AUX conjecture daroo kedo ne descending formal 
1/ 152 ( 0%) AUX kamoshirenai ja nai ascending informal 
1/ 152 ( 0%) AUX kamoshirenai kedo ne ascending informal 
Group 10 
67/ 314 (21%) id omou formal 
33/ 314 (10%) id omou kedo formal 
32/ 314 (10%) id omou n da kedo formal 
29/ 314 ( 9%) id omou informal 
18/ 314 ( 5%) id omou n da formal 
14/ 314 ( 4%) id omou n da yo ne descending formal 
13/ 314 ( 4%) id omou kedo informal 
10/ 314 ( 3%) id omou n da ne descending formal 
10/ 314 ( 3%) id omou n da kedo informal 
8/ 314 ( 2%) id omou n da yo formal 
8/ 314 ( 2%) id omou no descending informal 
8/ 314 ( 2%) id omou ne descending formal 
7/ 314 ( 2%) id omou kara formal 
7/ 314 ( 2%) id omou kara informal 
5/ 314 ( 1%) id omou yo informal 
5/ 314 ( 1%) id omou wake formal 
4/ 314 ( 1%) id omou na descending informal 
3/ 314 ( 0%) id omou yo ne descending informal 
3/ 314 ( 0%) id omou n da kedo ne descending formal 
3/ 314 ( 0%) id omou no ne descending informal 
3/ 314 ( 0%) id omou wa informal 
3/ 314 ( 0%) id omou no yo descending informal 
3/ 314 ( 0%) id omowareru formal 
3/ 314 ( 0%) id omou wake informal 
2/ 314 ( 0%) id omou ne descending informal 
2/ 314 ( 0%) id omou n da kedo ne descending informal 
2/ 314 ( 0%) id omou n da kara informal 
1/ 314 ( 0%) id omou wake da yo ne ascending formal 
1/ 314 ( 0%) id omowanai ascending informal 
1/ 314 ( 0%) id omou yo formal 
1/ 314 ( 0%) id omou no yo ne descending informal 
1/ 314 ( 0%) id omou wa yo formal 
1/ 314 ( 0%) id omou kedo ne descending informal 
1/ 314 ( 0%) id omou n da informal 
1/ 314 ( 0%) id omou n da na descending formal 
1/ 314 ( 0%) id omou kedo ne descending formal 




Appendix H-1 

Speaker F3, Informal discourse, occurrence of ending forms for each 
proposition type 

Informant: f03a Age: 40s Discourse Type: informal group 


Information type A 
D kara informal : 2 
D n dakedo informal : 2 
D n da yo ne descending informal : 1 
D no descending informal : 5 
D no ne descending informal : 1 
D noun informal : 3 
D no yo informal : 1 
D wa yo informal : 1 
D yo formal : 1 
D yo informal : 1 
Information type B 
DQ ja nai ascending informal : 2 
D no ne descending informal : 1 
DQ quasi-q ending : 2 
DQ quasi-q intra : 1 
Information type C 
D ja nai descending informal : 2 
DQ quasi-q intra : 1 
D yo ne # informal : 1 
Information type D 
q direct ascending informal : 1 
q noun ascending informal : 2 
q no ascending informal : 6 
Information type E 
q ja nai no ascending informal : 1 
q kashira descending informal : 1 
Information type F 
D daroo descending informal : 1 
DQ ja nai ascending informal : 2 
D kara informal : 1 
D n dakedo informal : 1 
DQ n daroo ascending informal : 2 
DQ n ja nai ascending informal : 1 
D no descending informal : 4 
D noun informal : 1 
D no yo ne descending informal : 1 
DQ quasi-q intra : 1 
D yo ne # informal : 2 
id (da) tte informal : 2 
id -to kiita informal : 1 
id mitai ja nai ascending informal : 1
id n da tte informal : 1
id omou informal : 1
id rashii informal : 1
Information type G
DQ daroo ascending informal : 1
D (direct) informal : 3
D kara informal : 1
D n da mono descending informal : 1
DQ n daroo ascending informal : 2
D n da yo ne descending informal : 1
D no descending informal : 4
D no yo informal : 2
DQ quasi-q ending : 3
DQ quasi-q intra : 2
D sa informal : 2
D wake yo informal : 1
id (da) tte informal : 1
id n da tte informal : 1
id rashii no descending informal : 1
q -kke ascending informal : 2
q direct ascending informal : 2
Information type H
q n da -kke na descending informal : 1
q n da -kke ascending : 1





Appendix H-2 
Speaker F3, Formal discourse, occurrence of ending forms for each
proposition type

Informant: f03b Age: 40s Discourse Type: formal group

Information type A
D (direct) formal : 37
D kara formal : 8
D kedo formal : 3
D n da formal : 6
D n dakedo formal : 19
D n desu no ne descending formal : 1
Information type B
q ja nai ka ascending formal : 1
Information type C
D n dakedo formal : 1
D ne descending formal : 1
D yo ne # formal : 2
id omou n da kedo formal : 1
Information type D
q direct ascending formal : 3
q ka ascending formal : 26
Information type E
DQ n da ne ascending formal : 6
DQ n daroo ascending formal : 1
D n da yo ne # formal : 2
DQ n da yo ne ascending formal : 3
q daroo ka descending formal : 7
q n desu ka ascending formal : 3
Information type F
AUX kamoshirenai node formal : 1
id (da) soo (da) formal : 3
id n (da) soo (da) ne ascending formal : 1
id n da soo da formal : 1
id omou kara formal : 1
id omou kedo formal : 1
id omou n da kedo formal : 4
id rashii n da ga formal : 2
id rashii n dakedo formal : 1
id rashii n da ne ascending formal : 3
Information type G
Information type H





Appendix H-3 

Speaker F3, Informal friend discourse, occurrence of ending forms by 
group for each proposition type. 

Info type: a Discourse type: friend name: f03a
g1=16/18 (88%) g2=2/18 (11%) g3=0/18 (0%) g4=0/18 (0%) g5=0/18 (0%)
g6=0/18 (0%) g7=0/18 (0%) g8=0/18 (0%) g9=0/18 (0%) g10=0/18 (0%)

Info type: b Discourse type: friend name: f03a
g1=0/6 (0%) g2=1/6 (16%) g3=0/6 (0%) g4=5/6 (83%) g5=0/6 (0%)
g6=0/6 (0%) g7=0/6 (0%) g8=0/6 (0%) g9=0/6 (0%) g10=0/6 (0%)

Info type: c Discourse type: friend name: f03a
g1=0/4 (0%) g2=0/4 (0%) g3=2/4 (50%) g4=1/4 (25%) g5=1/4 (25%)
g6=0/4 (0%) g7=0/4 (0%) g8=0/4 (0%) g9=0/4 (0%) g10=0/4 (0%)

Info type: d Discourse type: friend name: f03a
g1=0/9 (0%) g2=0/9 (0%) g3=0/9 (0%) g4=0/9 (0%) g5=0/9 (0%)
g6=9/9 (100%) g7=0/9 (0%) g8=0/9 (0%) g9=0/9 (0%) g10=0/9 (0%)

Info type: e Discourse type: friend name: f03a
g1=0/2 (0%) g2=0/2 (0%) g3=0/2 (0%) g4=1/2 (50%) g5=0/2 (0%)
g6=1/2 (50%) g7=0/2 (0%) g8=0/2 (0%) g9=0/2 (0%) g10=0/2 (0%)

Info type: f Discourse type: friend name: f03a
g1=7/24 (29%) g2=1/24 (4%) g3=1/24 (4%) g4=6/24 (25%) g5=2/24 (8%)
g6=0/24 (0%) g7=2/24 (8%) g8=4/24 (16%) g9=0/24 (0%) g10=1/24 (4%)

Info type: g Discourse type: friend name: f03a
g1=14/30 (46%) g2=1/30 (3%) g3=0/30 (0%) g4=8/30 (26%) g5=0/30 (0%)
g6=4/30 (13%) g7=1/30 (3%) g8=2/30 (6%) g9=0/30 (0%) g10=0/30 (0%)

Info type: h Discourse type: friend name: f03a
g1=0/2 (0%) g2=0/2 (0%) g3=0/2 (0%) g4=0/2 (0%) g5=0/2 (0%)
g6=2/2 (100%) g7=0/2 (0%) g8=0/2 (0%) g9=0/2 (0%) g10=0/2 (0%)

person: f03a All info types Total
g1=37/95 (38%) g2=5/95 (5%) g3=3/95 (3%) g4=21/95 (22%) g5=3/95 (3%)
g6=16/95 (16%) g7=3/95 (3%) g8=6/95 (6%) g9=0/95 (0%) g10=1/95 (1%)
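
The percentages in these tables appear to be truncated to whole numbers rather than rounded (16/18 is listed as 88%, not 89%), so the ten group percentages for an information type need not add up to 100%. The following is a minimal sketch of that arithmetic, assuming integer truncation; the function name format_group_shares is hypothetical and is not part of the original analysis.

```python
def format_group_shares(counts):
    """Format per-group counts as 'gN=count/total (P%)'.

    Assumption: percentages are truncated (floor), not rounded,
    which matches the figures listed in the appendix tables.
    """
    total = sum(counts)
    cells = []
    for i, n in enumerate(counts, start=1):
        # Integer floor division truncates the percentage.
        pct = n * 100 // total if total else 0
        cells.append(f"g{i}={n}/{total} ({pct}%)")
    return " ".join(cells)


# Counts for speaker f03a, information type f, groups g1 to g10,
# copied from the table above.
row_f = [7, 1, 1, 6, 2, 0, 2, 4, 0, 1]
print(format_group_shares(row_f))
```

For this row the truncated percentages total 98 rather than 100, which is why the listed figures should be read as approximate shares.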





Appendix H-4 

Speaker F3, Formal discourse, occurrence of ending forms by group for 
each proposition type. 

Info type: a Discourse type: formal name: f03b
g1=73/74 (98%) g2=1/74 (1%) g3=0/74 (0%) g4=0/74 (0%) g5=0/74 (0%)
g6=0/74 (0%) g7=0/74 (0%) g8=0/74 (0%) g9=0/74 (0%) g10=0/74 (0%)

Info type: b Discourse type: formal name: f03b
g1=0/1 (0%) g2=0/1 (0%) g3=0/1 (0%) g4=1/1 (100%) g5=0/1 (0%)
g6=0/1 (0%) g7=0/1 (0%) g8=0/1 (0%) g9=0/1 (0%) g10=0/1 (0%)

Info type: c Discourse type: formal name: f03b
g1=1/5 (20%) g2=1/5 (20%) g3=0/5 (0%) g4=0/5 (0%) g5=2/5 (40%)
g6=0/5 (0%) g7=0/5 (0%) g8=0/5 (0%) g9=0/5 (0%) g10=1/5 (20%)

Info type: d Discourse type: formal name: f03b
g1=0/29 (0%) g2=0/29 (0%) g3=0/29 (0%) g4=0/29 (0%) g5=0/29 (0%)
g6=29/29 (100%) g7=0/29 (0%) g8=0/29 (0%) g9=0/29 (0%) g10=0/29 (0%)

Info type: e Discourse type: formal name: f03b
g1=0/22 (0%) g2=0/22 (0%) g3=0/22 (0%) g4=10/22 (45%) g5=2/22 (9%)
g6=10/22 (45%) g7=0/22 (0%) g8=0/22 (0%) g9=0/22 (0%) g10=0/22 (0%)

Info type: f Discourse type: formal name: f03b
g1=0/18 (0%) g2=0/18 (0%) g3=0/18 (0%) g4=0/18 (0%) g5=0/18 (0%)
g6=0/18 (0%) g7=6/18 (33%) g8=5/18 (27%) g9=1/18 (5%) g10=6/18 (33%)

person: f03b All info types Total
g1=74/149 (49%) g2=2/149 (1%) g3=0/149 (0%) g4=11/149 (7%) g5=4/149 (2%)
g6=39/149 (26%) g7=6/149 (4%) g8=5/149 (3%) g9=1/149 (0%) g10=7/149 (4%)





Appendix I-1 

Speaker F5, Informal friend discourse, occurrence of ending forms by 
group for each proposition type. 

Info type: a Discourse type: friend name: f05a
g1=23/25 (92%) g2=2/25 (8%) g3=0/25 (0%) g4=0/25 (0%) g5=0/25 (0%)
g6=0/25 (0%) g7=0/25 (0%) g8=0/25 (0%) g9=0/25 (0%) g10=0/25 (0%)

Info type: b Discourse type: friend name: f05a
g1=0/4 (0%) g2=2/4 (50%) g3=2/4 (50%) g4=0/4 (0%) g5=0/4 (0%)
g6=0/4 (0%) g7=0/4 (0%) g8=0/4 (0%) g9=0/4 (0%) g10=0/4 (0%)

Info type: c Discourse type: friend name: f05a
g1=2/38 (5%) g2=13/38 (34%) g3=0/38 (0%) g4=5/38 (13%) g5=13/38 (34%)
g6=2/38 (5%) g7=0/38 (0%) g8=0/38 (0%) g9=3/38 (7%) g10=0/38 (0%)

Info type: d Discourse type: friend name: f05a
g1=0/64 (0%) g2=3/64 (4%) g3=0/64 (0%) g4=0/64 (0%) g5=0/64 (0%)
g6=60/64 (93%) g7=0/64 (0%) g8=0/64 (0%) g9=0/64 (0%) g10=1/64 (1%)

Info type: e Discourse type: friend name: f05a
g1=0/25 (0%) g2=0/25 (0%) g3=3/25 (12%) g4=19/25 (76%) g5=0/25 (0%)
g6=3/25 (12%) g7=0/25 (0%) g8=0/25 (0%) g9=0/25 (0%) g10=0/25 (0%)

Info type: f Discourse type: friend name: f05a
g1=1/10 (10%) g2=1/10 (10%) g3=1/10 (10%) g4=2/10 (20%) g5=1/10 (10%)
g6=2/10 (20%) g7=0/10 (0%) g8=0/10 (0%) g9=2/10 (20%) g10=0/10 (0%)

Info type: g Discourse type: friend name: f05a
g1=3/31 (9%) g2=0/31 (0%) g3=0/31 (0%) g4=7/31 (22%) g5=0/31 (0%)
g6=12/31 (38%) g7=2/31 (6%) g8=7/31 (22%) g9=0/31 (0%) g10=0/31 (0%)

Info type: h Discourse type: friend name: f05a
g1=11/18 (61%) g2=0/18 (0%) g3=0/18 (0%) g4=0/18 (0%) g5=0/18 (0%)
g6=7/18 (38%) g7=0/18 (0%) g8=0/18 (0%) g9=0/18 (0%) g10=0/18 (0%)

person: f05a All info types Total
g1=40/215 (18%) g2=21/215 (9%) g3=6/215 (2%) g4=33/215 (15%) g5=14/215 (6%)
g6=86/215 (40%) g7=2/215 (0%) g8=7/215 (3%) g9=5/215 (2%) g10=1/215 (0%)





Appendix I-2 

Speaker F5, Formal conversation discourse, occurrence of ending forms 
by group for each proposition type. 

Info type: a Discourse type: formal name: f05b
g1=85/129 (65%) g2=38/129 (29%) g3=1/129 (0%) g4=1/129 (0%) g5=0/129 (0%)
g6=1/129 (0%) g7=0/129 (0%) g8=0/129 (0%) g9=0/129 (0%) g10=3/129 (2%)

Info type: b Discourse type: formal name: f05b
g1=1/8 (12%) g2=1/8 (12%) g3=1/8 (12%) g4=3/8 (37%) g5=1/8 (12%)
g6=1/8 (12%) g7=0/8 (0%) g8=0/8 (0%) g9=0/8 (0%) g10=0/8 (0%)

Info type: c Discourse type: formal name: f05b
g1=1/64 (1%) g2=0/64 (0%) g3=1/64 (1%) g4=10/64 (15%) g5=47/64 (73%)
g6=3/64 (4%) g7=0/64 (0%) g8=0/64 (0%) g9=0/64 (0%) g10=2/64 (3%)

Info type: d Discourse type: formal name: f05b
g1=0/57 (0%) g2=1/57 (1%) g3=0/57 (0%) g4=1/57 (1%) g5=0/57 (0%)
g6=55/57 (96%) g7=0/57 (0%) g8=0/57 (0%) g9=0/57 (0%) g10=0/57 (0%)

Info type: e Discourse type: formal name: f05b
g1=0/32 (0%) g2=0/32 (0%) g3=2/32 (6%) g4=20/32 (62%) g5=3/32 (9%)
g6=4/32 (12%) g7=0/32 (0%) g8=0/32 (0%) g9=3/32 (9%) g10=0/32 (0%)

Info type: f Discourse type: formal name: f05b
g1=3/24 (12%) g2=2/24 (8%) g3=0/24 (0%) g4=2/24 (8%) g5=2/24 (8%)
g6=2/24 (8%) g7=1/24 (4%) g8=1/24 (4%) g9=8/24 (33%) g10=3/24 (12%)

Info type: g Discourse type: formal name: f05b
g1=0/11 (0%) g2=0/11 (0%) g3=0/11 (0%) g4=3/11 (27%) g5=1/11 (9%)
g6=3/11 (27%) g7=4/11 (36%) g8=0/11 (0%) g9=0/11 (0%) g10=0/11 (0%)

Info type: h Discourse type: formal name: f05b
g1=27/31 (87%) g2=2/31 (6%) g3=0/31 (0%) g4=0/31 (0%) g5=0/31 (0%)
g6=1/31 (3%) g7=0/31 (0%) g8=1/31 (3%) g9=0/31 (0%) g10=0/31 (0%)

person: f05b All info types Total
g1=117/356 (32%) g2=44/356 (12%) g3=5/356 (1%) g4=40/356 (11%) g5=54/356 (15%)
g6=70/356 (19%) g7=5/356 (1%) g8=2/356 (0%) g9=11/356 (3%) g10=8/356 (2%)





Appendix I-3 

Speaker F5, Informal family discourse, occurrence of ending forms by 
group for each proposition type. 

Info type: a Discourse type: family name: f05c
g1=57/69 (82%) g2=5/69 (7%) g3=0/69 (0%) g4=0/69 (0%) g5=0/69 (0%)
g6=3/69 (4%) g7=0/69 (0%) g8=1/69 (1%) g9=0/69 (0%) g10=3/69 (4%)

Info type: b Discourse type: family name: f05c
g1=1/12 (8%) g2=2/12 (16%) g3=5/12 (41%) g4=3/12 (25%) g5=0/12 (0%)
g6=1/12 (8%) g7=0/12 (0%) g8=0/12 (0%) g9=0/12 (0%) g10=0/12 (0%)

Info type: c Discourse type: family name: f05c
g1=6/45 (13%) g2=6/45 (13%) g3=0/45 (0%) g4=19/45 (42%) g5=13/45 (28%)
g6=1/45 (2%) g7=0/45 (0%) g8=0/45 (0%) g9=0/45 (0%) g10=0/45 (0%)

Info type: d Discourse type: family name: f05c
g1=0/92 (0%) g2=0/92 (0%) g3=0/92 (0%) g4=3/92 (3%) g5=0/92 (0%)
g6=88/92 (95%) g7=0/92 (0%) g8=0/92 (0%) g9=1/92 (1%) g10=0/92 (0%)

Info type: e Discourse type: family name: f05c
g1=1/21 (4%) g2=1/21 (4%) g3=4/21 (19%) g4=6/21 (28%) g5=4/21 (19%)
g6=4/21 (19%) g7=0/21 (0%) g8=0/21 (0%) g9=1/21 (4%) g10=0/21 (0%)

Info type: f Discourse type: family name: f05c
g1=0/16 (0%) g2=3/16 (18%) g3=0/16 (0%) g4=4/16 (25%) g5=3/16 (18%)
g6=2/16 (12%) g7=1/16 (6%) g8=2/16 (12%) g9=1/16 (6%) g10=0/16 (0%)

Info type: g Discourse type: family name: f05c
g1=2/10 (20%) g2=0/10 (0%) g3=0/10 (0%) g4=5/10 (50%) g5=0/10 (0%)
g6=3/10 (30%) g7=0/10 (0%) g8=0/10 (0%) g9=0/10 (0%) g10=0/10 (0%)

Info type: h Discourse type: family name: f05c
g1=2/14 (14%) g2=6/14 (42%) g3=0/14 (0%) g4=0/14 (0%) g5=0/14 (0%)
g6=6/14 (42%) g7=0/14 (0%) g8=0/14 (0%) g9=0/14 (0%) g10=0/14 (0%)

person: f05c All info types Total
g1=69/279 (24%) g2=23/279 (8%) g3=9/279 (3%) g4=40/279 (14%) g5=20/279 (7%)
g6=108/279 (38%) g7=1/279 (0%) g8=3/279 (1%) g9=3/279 (1%) g10=3/279 (1%)





Appendix J 
Speaker M16, occurrence of ending forms for each proposition type. 

Informant: m16 Age: 30s Discourse Type: formal 


Information type A
D (direct) formal : 1
D (direct) informal : 4
D kedo informal : 1
D kedo ne descending formal : 1
D no descending informal : 3
D wa informal : 1
D yo informal : 4
Information type B
D ja nai descending informal : 1
DQ n da yo ne ascending formal : 1
Information type C
DQ ja nai ascending informal : 1
DQ n da yo ne ascending formal : 1
D ne # informal : 2
DQ ne ascending informal : 2
D yo ne # informal : 1
Information type D
DQ ne ascending informal : 1
q daroo ka descending informal : 1
q direct ascending informal : 2
q noun ascending informal : 2
q no ascending informal : 5
Information type E
DQ daroo ascending informal : 1
q -kke ascending informal : 1
q ja nai no ascending informal : 1
Information type F


AUX conjecture daroo ne descending informal : 1
DQ daroo ascending informal : 2
D (direct) informal : 3
D kedo informal : 1
D n da yo ne # formal : 2
DQ n da yo ne ascending formal : 2
D ne descending informal : 2
DQ ne ascending informal : 2
D no descending informal : 2
D wake informal : 1
id (da) soo (da) informal : 1
id (da) tte informal : 4
id -to kiita no descending informal : 1
id n da tte informal : 1
q n ja nai no ascending informal : 1
Information type G
DQ daroo ascending informal : 2
D (direct) informal : 1
D no descending informal : 1
DQ no ne ascending informal : 1
D no yo informal : 1
D yo ne # informal : 1
id (da) tte informal : 2
Information type H

