THE NEW WORDS KIDS HEAR FROM TRANSLATED PICTUREBOOKS

This study shows how the language in translated picturebooks is enriched by the use of rare words. We document how the translation of picturebooks from English to Portuguese results in the use of rare words in Portuguese. Evidence indicates that children learn new vocabulary through readings of picturebooks (Noble et al., 2019) and that translators make choices that contribute to the use of rare words (Ketola, 2018). The sample of 86 picturebooks was selected from a list recommended by the Portuguese national reading plan for 3-5-year-olds. The identification of rare words was done using a frequency analysis in both Portuguese, using ESCOLEX, and English, using the ChildFreq tool. Findings indicate that translated picturebooks use rich and varied lexicon and include an average of 6.6 rare words. Twenty-two percent of these words originate from literal and non-literal translations and are not rare in the original texts. This indicates that the process of translation contributes to increasing children's exposure to rare words.


INTRODUCTION
Research indicates that reading stories to children contributes to their language development and makes them better readers in primary school (Shahaeian et al., 2018). It has been amply demonstrated that written language is more complex than oral language and that reading to young children should be a common practice because it has a direct impact on their language development (Golinkoff et al., 2019; Grolig, 2020). Studies show that books offer language input that is more complex than oral language and that children learn new words by listening to written stories (Dickinson et al., 2012; Elley, 1989; Penno et al., 2002). Books contain more lexical diversity than oral speech, and parents use more low-frequency words when they read to their children than when they engage in other activities with them (Noble et al., 2019).
Children's books, namely picturebooks, are a medium that supports shared reading interactions between adults and children and play a crucial role in the interactional triad of child, adult and book (Grolig, 2020). Picturebooks are intended for young children, relying on the use of pictures to tell or complement the written story. They contain specific text-picture characteristics that facilitate talk about the text and the visual elements present (Breit-Smith et al., 2017). This talk often includes parents asking questions and asking children to point to the illustrations, which results in the learning of the words the illustrations depict (Sénéchal et al., 1995).
In the field of translation studies, recent studies have focused on picturebooks as a medium that invites transcreation, as the process of adapting the material for a new linguistic and cultural context (Ketola, 2018). It has been observed that the process of translating picturebooks "call[s] for extensive adaptations of verbal material in order to create multimodally coherent products for a new target audience" (p. 127). This means that translation is merged with (re)creation and, as a consequence, the resulting text may differ from the original one in terms of linguistic features.
Given the evidence that picturebooks, through the new linguistic input they provide, contribute to children's language development and specifically to vocabulary learning, we explore how translated picturebooks recommended for 3-5-year-olds may play a relevant role in this contribution. Specifically, our study analyzes how picturebooks translated from English to Portuguese result in the use of rare, low-frequency words in Portuguese. The purpose is two-fold: first, to identify the rare words used in translated picturebooks; second, to offer an interpretation of the process of transcreation that results in the use of rare words in the target language, which may or may not translate high-frequency words in the source language. To our knowledge, this is the first study to address the role of translated picturebooks in providing new vocabulary for early language learning. As such, it brings new insights about how translated picturebooks offer opportunities for children to be exposed to new vocabulary and can inform research that looks at the lexical quality of translated picturebooks and its relationship with children's language development.

REVIEW OF THE LITERATURE
Shared reading, the act of reading a book to a child and discussing it (Zucker et al., 2012), is one type of language experience that toddlers, preschoolers and kindergarten children should be exposed to because it will make them better readers (Dickinson et al., 2012). Several longitudinal studies corroborate the notion that reading comprehension in the primary grades is largely dependent on early language experiences acquired during shared reading (Dickinson et al., 2012). This type of reading can be characterized as a conversation about the printed material during which parents or caregivers ask about experiences children had that are related to the story, and ask them to name pictures and to explain situations (Whitehurst et al., 1988; Whitehurst & Lonigan, 2001).
Shared reading experiences with young children may occur in the home environment, as a part of the Home Literacy Environment (HLE), and as part of the Child Care Literacy Environment (CCLE), and evidence indicates that these experiences contribute to better reading performance in primary school (Grolig, 2020). Regarding the HLE, several studies show that frequent storybook reading in the home promotes preschool children's vocabulary acquisition and their morphological and syntactic comprehension (Sénéchal et al., 2008). Moreover, it is positively associated with an increase in children's reading performance in grade four, as measured by standardized tests. This evidence has been gathered in studies that investigate parental book reading in Canada (Sénéchal & Young, 2008), in Europe (Araújo & Costa, 2015), in the United States (Mol & Bus, 2011; Mol et al., 2008; Whitehurst & Lonigan, 2001) and in Australia (Kalb & van Ours, 2014). For instance, Sénéchal's studies with Canadian children show that "parents' reports of shared reading were a robust predictor of children's receptive and expressive vocabulary" in grade four (Sénéchal, 2011, p. 179).
Evidence gathered in the CCLE supports these findings. For example, Dickinson and Porche (2011) found that fourth grade vocabulary knowledge was related to shared reading experiences in preschool and kindergarten classrooms. Furthermore, the results of several studies suggest that what matters is not only the frequency of shared reading, but the kind of talk about books that teachers engage in with children (Zucker et al., 2012). Children learn more vocabulary when teachers use an interactional style of reading that includes explanations of word meanings as well as inferential comments and questions (Beck & McKeown, 2007; NELP, 2008). This finding has important educational implications because vocabulary knowledge at the age of 5, in particular, is one of the strongest predictors of children's ability to learn to read (Durham et al., 2007; Snow et al., 1998). In fact, research indicates that direct teaching of word meanings during shared reading enhances the learning of the meaning of words, both in the CCLE and in home contexts (Biemiller, 2006).
Shared reading contributes to increasing children's vocabulary knowledge as well as to their acquisition of complex language knowledge (e.g., comprehension of long, grammatically complex sentences), and this predicts later reading ability in elementary school. As Dickinson et al. (2012) contend, "Children learn new vocabulary through grammar and grammar through vocabulary" (p. 5). That is, the way sentences are constructed offers children clues as to whether a word is a verb or an adjective, for example.

Picturebooks as a medium for language learning
Picturebooks are unique in that they establish a link between pictures and text and are designed to entertain young readers (Massaro, 2015). This harmony in picturebooks constitutes a semiotic whole, since the interaction between the verbal and the visual systems is a "conditio sine qua non for the construction of the narrative meaning and for the fruition and enjoyment of the genre" (Sezzi, 2020, p. 216). The pictures resemble objects or scenes and complement or even substitute part of the narrative text. When listening to picturebooks, four-year-old children have been observed to fixate on the pictures 95% of the time (Evans & Saint-Aubin, 2005). We do not know much about the type of picturebooks children are exposed to in the HLE and in the CCLE. One indication, from a survey by Hudson Kam and Matthewson (2017), is that there is wide variability in household selection of picturebooks; other studies indicate that in the HLE a child will typically hear about 10 picturebooks in one month during shared reading experiences (Bradley et al., 2001; Young et al., 1998). Moreover, research indicates that the language input children receive when listening to speech from text may account for 3 to 10 percent of all speech children hear in one day, with that percentage varying according to how often a child is read to (e.g., twice daily, once a day, or less) (Shneidman et al., 2013; Weisleder & Fernald, 2014).
Picturebooks constitute a medium through which children hear not only rare words or more novel vocabulary during shared reading, but also more complex sentences, including passive sentences and sentences containing relative clauses (Cameron-Faulkner & Noble, 2013). Furthermore, children's exposure to books predicts the spoken production of complex sentences in eight-year-olds (Montag & MacDonald, 2015). Studies show that when parents read picturebooks to their children they tend to stick to print and read the text, although they have also been observed to engage in book-related talk beyond the text itself (Montag, 2019). In short, picturebooks offer varied vocabulary and complex language input, which supports later reading achievement (Stanovich, 2000), because adults attend to, and possibly expand, the printed texts during shared reading interactions (Grolig, 2020).
Importantly, research indicates that picturebooks include more diverse vocabulary than adults' spoken language (Cunningham & Stanovich, 1997). Tabors (2001) and Hayes (1988) have shown that picturebooks have more low-frequency words than everyday spoken conversations. Hayes and Ahrens (1988) calculated that the percentage of rare words in children's books is 30.9, compared to 9.9 in the speech of adults talking to children, commonly referred to as Child-Directed Speech (CDS). More recently, Massaro (2015) and Montag (2019) found that picturebooks contain almost three times more rare words than CDS. Montag, Jones and Smith's (2015) study of the new words that appear in picturebooks for children aged 0-60 months also corroborates this finding.
CDS does not include the same diversity of vocabulary and syntax as print, although it can also support children's language learning. Studies show that preschool, kindergarten and second grade children have larger vocabularies when their parents use a high proportion of rare words in CDS, regardless of the activity they engage in with their children (Golinkoff et al., 2018; Weizman & Snow, 2001). The more rare words parents use when talking to their children, the higher the children score on tests of receptive vocabulary knowledge, namely on the Peabody Picture Vocabulary Test (Rowe, 2012). Thus, CDS can also be a source of new vocabulary learning for children, and rare words have been observed to make up about 2-6% of the words parents use when talking to their children (Rowe, 2012; Weizman & Snow, 2001). Nonetheless, listening to stories increases children's exposure to more sophisticated language, rare words in particular, and this promotes their language development (Golinkoff et al., 2018; McLeod & McDade, 2011). Taken together, these findings suggest that the text in picturebooks is an important source of vocabulary learning for young children, and we know that "exposure to vocabulary is particularly likely to have beneficial effects when the input includes a high density of novel words relative to total words" (Dickinson et al., 2012, p. 4). Moreover, research indicates that the book-related talk that adults use during shared reading, namely the explanation of word meaning, is likely to contribute to children's language development.

Translated picturebooks
Translations constitute a very significant percentage of all picturebooks published in Portugal, especially considering the most popular ones, in both school and home environments. Translated picturebooks proliferate among editions released by the most successful and prestigious publishers, and also among the list of books recommended by the National Reading Plan (known as PNL). This may not be the case in other countries, but the situation in Portugal follows the general trend, whereby for "small languages, such as Czech, translations constitute a substantial part of the canon of children's literature" (Čermáková, 2018, p. 118), whereas in English-speaking nations the situation is reversed: "children's book production in the UK yearly involves only about 2% of translations" (Čermáková, 2018, p.118).
It is widely acknowledged that children's literature constitutes crossover fiction, in the sense that adults are frequent co-readers, mediators and sometimes performers of the story (Sezzi, 2009; Spitz, 1999). Still, the main and final audience are children, and this implies that the choice of words in the translation of picturebooks is influenced by the desire to address the characteristics and needs of children, or what translators assume these characteristics and needs to be (Ketola, 2018). Thus, translators generally aim to achieve a version of the text whose language is simple and clear, so as to ensure that the young reader understands it (Thompson & Sealey, 2007). The accepted principles of translation in children's literature (simplification, explicitation and normalization; Baker, 1996) account for the tendency to use common, already known words to ensure comprehension. Simplification means literally "to simplify the language used in translation" (Baker, 1996, pp. 181-182), explicitation is "to spell things out rather than leave them implicit" (Baker, 1996, pp. 181-182), and normalization involves conforming "to patterns and practices which are typical of the target language, even to the point of exaggerating them" (Baker, 1996, pp. 176-177).
However, it should also be acknowledged that this intention is compatible with the choice of vocabulary which is not commonly used with or by children in translated children's books, whenever these words are thought to make meaning clearer or more accurate (for example, if the rare word stands for a particular animal species), often in harmony with the illustrations. Moreover, the purpose of ensuring that the text is easy to understand by children is not prioritized to the point of favouring repetition over novelty, in terms of vocabulary. On the contrary, it has been noted that repetition of words is generally deemed undesirable by writers and translators alike, and children's literature is no exception. This is evident, for example, in the number of verbs chosen to replace the English verb said in translated children's books, whenever reported speech is used (Corness, 2009; Fárová, 2016; Nádvorníková, 2017).
The need to preserve rhyme is another reason that accounts for the use of rare words in translated picturebooks (Ketola, 2018). In order to reproduce the harmonious musical qualities of the source language, translators often resort to vocabulary that is unfamiliar to children. If their choice implies a shift from the semantic or stylistic quality of the original text, the process is called modulation (van Leuven-Zwart, 1989, pp. 159-169). If translators reach the point of inserting words in the version they are (re)creating that have no counterpart in the original text, this shift is known as mutation (van Leuven-Zwart, 1989, pp. 159-169).
When the use of rare words in picturebooks derives from a felt need to add content to the text which cannot be attributed to the intention of clarifying meaning or making the underlying message more graspable, this might be called manipulation (Shavit, 1981). In that case, the translator is trying to make the message more complete or "acceptable" from an ethical or pedagogical point of view. For some scholars, this opens a discussion as to whether these adjustments are acceptable or unnecessary (Klingberg, 1986; Toro, 2020). As Oittinen points out, adaptation is not in itself a negative strategy and it may be argued that all translations entail some degree of adaptation, as they are, inevitably, a transformation of a product into another (Oittinen, 2000).
Our interest in the presence of rare words in translated picturebooks is anchored in the evidence that reading texts with new, unknown vocabulary is beneficial, as it contributes to children's language development. Thus, in the present study, we investigate whether the translated texts offer opportunities for children to be exposed to novel or rare words. These words constitute low-frequency vocabulary that is not likely to be found in texts directed at children in the 3-5 age range.
In general, to understand how picturebooks can promote young children's language development, we have to understand the contribution of multiple factors, that is, how the book text, extra-text talk, and the pictures all contribute to the learning environment (Montag, 2019). Specifically, the present work provides information about how one of these factors, the text in translated picturebooks, may contribute to the language learning environment.

METHODOLOGY
In order to understand to what extent and how rare words appear in picturebooks translated from English to Portuguese, we identify the rare words and describe the options translators choose that result in their use. First, for identification purposes we conducted a frequency word count. Second, we conducted a qualitative analysis of the options made by translators in order to describe how the strategies they employ result in the use of rare words, or low frequency words in the target language. Specifically, our study addresses the following questions: 1) To what extent are rare words used in Portuguese translations of English picturebooks? 2) What is the correspondence of rare words in the target language with those in the source language? 3) What translation options are used and how do they result in the use of rare words?

Data collection
First, we selected all picturebooks translated from English to Portuguese, a total of 86, from a list of 167 books (fiction only) recommended by the Portuguese National Reading Plan between 2017 and 2020 for 3-5-year-olds. The National Reading Plan, known as PNL, is an initiative of the Portuguese Government intended to constitute an institutional response to the concern about the literacy level of the general population, and of the youngest in particular, which is significantly below the European average (Decreto Lei no 64/2006). It is materialized in a set of defined strategies to promote the development of reading and writing skills, as well as the widening and deepening of reading habits, among the school population (Decreto Lei no 64/2006).
The listings of works suggested by the PNL, published on its website, include varied themes and are intended for a diverse audience, such as children, youngsters and adults. The books listed result from a first selection by publishing companies, who suggest their inclusion in PNL age-specific lists, and are subsequently evaluated by independent specialists of recognized merit and qualification in the field of literacy/children's literature.
The list of 167 books is composed mainly of translated works (74%); books translated from English, the 86 titles in our sample, account for 51% of the full list. Forty-four picturebooks, or 26% of the list, are written by Portuguese authors.
Paper copies of the translated picturebooks were gathered from PNL headquarters. The original works in English were obtained by checking available written and/or read-aloud electronic versions. This list of 86 English picturebooks includes several well-known British and American authors, such as Beatrix Potter, Benji Davies, David McKee, Eric Carle, Jen Campbell, Kirsten Hall, and Roger Hargreaves. The linguistic corpus of these picturebooks was considered as follows: 1) total number of words in each picturebook, 2) all words that could be considered rare words in Portuguese and their translation equivalents in the source language. We excluded informational picturebooks, because this genre includes scientific terms and thus includes many more rare words than fiction. As Massaro (2015) notes, the presence of new words is expected to be larger in informational books.

Data analysis
In order to contextualize the linguistic corpus of the study, we categorized the collected 86 picturebooks translated from English to Portuguese in terms of their thematic focus and total number of words. In the categorization of themes, we considered the categories used in national school libraries and the books' synopses available in the publishers' site. From these, at times too general (animals, adventure, emotions) or too specific (books, grandparents and tenderness, mischief caused by a tiger) we created larger, more consistent categories.
In addressing the first research question, we considered all words present in the translated picturebooks that might constitute rare words for 3-5-year-olds. To ascertain whether the words identified as potential rare words could be considered as such, we conducted a quantitative word frequency analysis using ESCOLEX. This database uses a Portuguese word corpus based on published school manuals from first through sixth grades (Soares et al., 2014). Our corpus search method consisted of a search in ESCOLEX to determine the frequency with which words appeared in first grade manuals, given that these constitute the typical printed material Portuguese children will encounter once they move from early childhood education to grade school. For this grade level, ESCOLEX includes a total of 8,313 words collected from 25 textbooks (Soares et al., 2014). In the identification of rare words we counted words of the same lexical family, following the rationale offered by Nagy & Anderson (1984) that the meaning of the new word can be determined by using knowledge of its root. For example, if no instances of a verb in the past tense were found in ESCOLEX but the same verb was found in the present tense, we counted its frequency and included it in the analyses.
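The lexical-family rule described above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the frequency table, the form-to-family mapping, and the function name `family_frequency` are all hypothetical, and the counts are not taken from ESCOLEX.

```python
from collections import defaultdict

# Hypothetical first-grade frequency table (word form -> occurrences).
# These counts are illustrative only, not ESCOLEX values.
form_counts = {"ouvir": 2, "ouve": 3, "casa": 40}

# Hypothetical mapping from an inflected form to the head of its family.
family_of = {"ouvir": "ouvir", "ouve": "ouvir", "ouviu": "ouvir", "casa": "casa"}


def family_frequency(word, form_counts, family_of):
    """Sum the occurrences of every attested form in the word's family."""
    head = family_of.get(word, word)
    totals = defaultdict(int)
    for form, count in form_counts.items():
        # Forms without a mapping are treated as their own family.
        totals[family_of.get(form, form)] += count
    return totals[head]


# "ouviu" (past tense) is unattested, but its family members are counted.
print(family_frequency("ouviu", form_counts, family_of))
```

Under this sketch, an unattested past-tense form inherits the frequency of its attested relatives, so it would not be flagged as rare merely because that particular inflection is missing from the corpus.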
We excluded acronyms, idiomatic expressions, and rare words that coincided with common words in graphic/phonemic terms (homonyms such as parada, which means parade, but can also be the past participle of the verb to stop, i.e., stopped). We established the threshold of 1 occurrence in ESCOLEX to consider a word a rare one. High-frequency words are typically defined as having more than 100 occurrences per 1 million words (Brysbaert et al., 2018). Since the ESCOLEX dataset is much smaller and includes only 8,313 words in the grade one textbook-based linguistic corpus we use, the threshold of 1 is equivalent to slightly less than 100 occurrences per 1 million words.
In order to understand if the rare words in the target language correspond to rare ones in the source language, we looked for all translation counterparts in the original English picturebooks. We counted the number of Portuguese rare words that did not have a translation counterpart and considered those that had a translation counterpart that was a high-frequency word in English. In order to determine the frequency of the latter, we used the tool ChildFreq (Bååth, 2010). The corpus in ChildFreq is taken from the English part of CHILDES (MacWhinney, 2000), which includes both British and American transcripts of talk between parents and their children. The ChildFreq tool comprises a total of 3.5 million word tokens, and children included in the sampled transcripts range from six months to seven years of age, with most children being three years old (Bååth, 2010). We used the age interval of 3-5 years, with a corresponding word corpus of 1,000,000 words, to ascertain if the words identified as rare words in Portuguese were also rare words in the source text in English. In checking words in this online child frequency tool, we considered individual words and, when the Portuguese target corresponded to more than one English word or expression, we searched the frequency of each word (e.g., sand-bank, lived in fear, mountain climber).
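The threshold rule and its per-million normalization can be sketched as follows. The function names are hypothetical, and the sketch treats the 8,313 figure as a running-word (token) count, which is an assumption; if that figure counts distinct word forms instead, the underlying token total is larger and the per-million rate is correspondingly lower.

```python
def per_million(occurrences, corpus_size):
    """Scale a raw count to occurrences per one million words."""
    return occurrences * 1_000_000 / corpus_size


def is_rare(occurrences, threshold=1):
    """A word counts as rare when it occurs at most `threshold` times."""
    return occurrences <= threshold


# Figure reported for the grade-one subcorpus; assumed here to be a token count.
GRADE_ONE_SIZE = 8_313

print(round(per_million(1, GRADE_ONE_SIZE), 1))
print(is_rare(0), is_rare(1), is_rare(2))
```

Keeping the raw-count test separate from the normalization makes it easy to rerun the classification with a different corpus size or cutoff.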
Regarding our interest in vocabulary choices made by translators, we found it relevant to assess whether the new/rare words in the target language were necessary or not, considering their counterpart in the original text and possible alternatives to translate them. We were also interested in the possible motives that might explain why a translator chose to use a new or rare word instead of a common, known one, regardless of the frequency of the counterpart word in the original text.

RESULTS
The 86 books analyzed fall into five categories: General knowledge, Life experiences and relationships, Humor, Adventure, and Fantasy. A book about planet Earth, animals, food, numbers, travels or jobs falls into the General knowledge category; a book about friendship, personal characteristics, manners, dealing with emotions and interacting with others falls into the Life experiences and relationships category; a book mainly intended to make children laugh about funny characters and situations falls into the Humor category; a book whose characters venture out of their comfort zones and into unknown environments, being scared and/or surprised by novelty, falls into the Adventure category; a book whose characters, settings and events are predominantly other-worldly, inviting the child to conceive an alternative reality where anything is possible, falls into the Fantasy category. Forty-six books are mainly about Life experiences and relationships; 12 books focus on General knowledge; 4 books are dedicated to Humor, while no books were solely about Adventure or Fantasy. In many cases (28 titles), the picturebooks combine 2 or more themes, such as General knowledge & life experience (7 books).
The average book length in our sample is 540 words per book. Analyses show that, of the 86 picturebooks sampled, only 7 books, or 8% of the sample, did not contain any rare words. Our selection of potential rare words in the 79 remaining picturebooks yielded 629 words, and our findings indicate that the overwhelming majority of the selected words, 565, are rare, because they never occur in the ESCOLEX linguistic corpus or appear only one time. Most rare words have a frequency of zero in ESCOLEX, while only 79 out of the 565 appear one time. In the 86 picturebooks translated from English to Portuguese we find, on average, 6.6 rare words. The mode is two rare words and the median is 4.5.
The fact that the average is larger than the median reveals a right skew in the data, clearly displayed in Figure 2. The variation in the sample (standard deviation) is considerable (Table 1). There are very few repetitions of rare words. Our sample is composed, almost exclusively, of unique types of rare words. More specifically, out of the 565 rare words identified, only 26 words appear more than once in different picturebooks. For example, the verb sussurrar and the nouns alvoroço and fôlego each appear twice, and only the preposition sob (under) appears four times; in every case the repetitions occur in different books.
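The relation between mean, median and mode that signals this skew can be illustrated with Python's standard `statistics` module. Only the totals 565 and 86 come from the text; the per-book list below is hypothetical and chosen solely to show a right-skewed pattern, so its summary values will not match those reported for the study.

```python
import statistics

# Reported aggregates: 565 rare words across 86 books.
print(round(565 / 86, 1))

# Hypothetical per-book counts, chosen only to show right skew.
counts = [0, 1, 2, 2, 2, 3, 4, 5, 8, 15, 32]

mean = statistics.mean(counts)
median = statistics.median(counts)
mode = statistics.mode(counts)

# A few large values pull the mean above the median.
print(mean > median)
```

A handful of books with many rare words is enough to lift the mean well above the median, which is why both statistics are worth reporting together.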
The number of rare words per picturebook varies, as Figure 2 shows, with most books having between 0 and 3 rare words. One book in particular has almost 60 rare words, while another that stands out has 32. In the great majority of books the number of rare words varies between 0 and 15. Figure 1 shows the rank distribution of rare words per book. It is evident that there is a Pareto-type inverse relationship: as the number of rare words increases, the number of books that have them decreases. This Pareto-type distribution is well known in linguistics through the work of George Zipf (1935, 1949) and other researchers. It describes phenomena for which rare occurrences are more frequent than what is assumed by a normal distribution and other types of common probability distributions. The data here are not enough to conclusively indicate a particular probability distribution or to make a clear statement about the characteristics of the observed tail behaviour. At any rate, the log-log plot of the tails for observed rare words suggests heavy-tailed behaviour: instead of dropping abruptly, as in a normal distribution, the counts show an almost linear log-log decrease (see, e.g., Samorodnitsky & Taqqu, 1994).
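The tail inspection described above can be sketched as follows. The per-book counts are hypothetical and the function names `tail_counts` and `loglog_slope` are illustrative; the fitted slope is only a rough descriptive summary of how linear the log-log decrease is, not a test of any particular distribution.

```python
import math


def tail_counts(values):
    """For each k from 1 to max(values), count how many values are >= k."""
    return {k: sum(v >= k for v in values) for k in range(1, max(values) + 1)}


def loglog_slope(points):
    """Least-squares slope of log(y) on log(x) for positive (x, y) pairs."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Hypothetical per-book rare-word counts (not the study's data).
counts = [0, 1, 1, 2, 2, 3, 4, 6, 9, 15, 32, 58]
tail = tail_counts(counts)
slope = loglog_slope([(k, n) for k, n in tail.items() if n > 0])
print(slope)
```

A roughly constant negative slope across the range of k is what an approximately linear log-log decrease looks like numerically; a normal-like distribution would instead show the tail count collapsing quickly.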
Fifty-one, or 9%, of the rare words that occur in Portuguese do not have a source equivalent in English. That is, the translator resorted to mutation. Of the 514 rare words that do have a translation counterpart in the original text, 76 (13% of all 565 rare words) are rare words in Portuguese but are not rare, or low-frequency, ones in the original English texts. Table 2 summarizes these findings. This distribution indicates that 22% (9% + 13%) of rare words, or about 1.5 rare words per picturebook, were either created by the translator or had as a translation equivalent a frequent word in English. These two translation processes correspond to mutation and modulation, respectively. Appendix 1 shows the total number of words, their frequency in ESCOLEX, their translation equivalent, whether the words are cognates, and the associated translation process. Our analyses show that the use of cognate words is associated with translation, not with mutation or modulation, and that it corresponds to 18% of the total number of rare words (565) in the corpus.
We find that translators often choose to use new words that are rare words in the target language due to the need to make words rhyme as in the original text. This requirement sometimes even makes translators resort to words that have no counterpart in the original text (mutation), thus adding to or changing the meaning of the text for the sake of ensuring terminal sound identity between words at the end of each verse or line. This accounts for the use of rare words such as adejar, celestial, a definhar, detalhado, euforia and triunfante, among others (most of which have no counterpart in the original text). These words seem to serve not only the most immediate purpose of rhyming with another, more mundane, word used previously in the text, but also the purpose of lending it a poetic quality.
The intention of rendering the text more "literary" (as well as of avoiding repetition) sometimes seems to explain less literal choices made by translators, as they employ rich and varied vocabulary that has no counterpart in the original text. We found many examples of rare words whose source in English is a common word. Translators seem to have intentionally refrained from choosing the most literal equivalent in order to make written language more diversified and sophisticated than speech. This is the case with aconchegam-se, a desabar, estacaram, garridas and bramiu, for example, which are used to translate amass, falling, stared, so bright and said, respectively (their literal equivalents would be "juntam-se", "a cair", "pararam" / "olharam", "tão vivas" and "disse", respectively).
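The shares reported for mutation and modulation follow directly from the raw counts given in this section. The short sketch below recomputes them; the constant and function names are illustrative, and the only inputs are the figures stated in the text (565 rare words, 51 without a counterpart, 76 with a common English counterpart, 86 books).

```python
TOTAL_RARE = 565      # rare words identified in the corpus
NO_COUNTERPART = 51   # no source equivalent (mutation)
COMMON_SOURCE = 76    # counterpart is not rare in English (modulation)
BOOKS = 86


def share(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100 * part / whole


mutation = share(NO_COUNTERPART, TOTAL_RARE)
modulation = share(COMMON_SOURCE, TOTAL_RARE)
combined = share(NO_COUNTERPART + COMMON_SOURCE, TOTAL_RARE)
per_book = (NO_COUNTERPART + COMMON_SOURCE) / BOOKS

print(f"{mutation:.1f}% {modulation:.1f}% {combined:.1f}% {per_book:.2f}")
```

Note that both percentages use the full set of 565 rare words as the denominator, which is what allows them to be added into a single combined share.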
In some instances, it seems that translators intentionally wish to offer young readers the opportunity to learn a new word while they are enjoying the story. That seems to be the only valid reason to explain why, for example, the sentence «Eventually, we stopped wandering» (Bunting, 2019) becomes «Antes, nunca parávamos no mesmo sítio, éramos nómadas», where the word nómadas (nomads) is provided as extra lexical information in the translation.
In addition, the translated picturebooks present a good deal of new vocabulary to children because, on many occasions, translators choose not to use the literal equivalents of common words in the original. This is most obvious in the case of the verb said, as documented in the literature (Corness, 2009; Fárová, 2016; Nádvorníková, 2017). Through the process of modulation, it turns into a number of different verbs, either to refer to the specific sounds that different species of animals make when they "speak" (thus "teaching" specific vocabulary as the story is told, as with relinchou, bramiu and mugiu to translate said) or to convey a clearer idea of the attitude or intention with which animals speak (for example, through the use of tranquilizou-o, advertiram, desabafou and rezingou to translate said). However, this also happens with other verbs, for example eat, which is sometimes translated as devorar (to devour) instead of the immediate equivalent comer (to eat), possibly to create a more striking impression (as it refers to a stronger, more intense kind of eating), as well as to make the text conform to the illustrations and the imagery suggested by the characters and the plot (a starving caterpillar or a scary monster, for example).
The translated picturebooks we analyzed also present rare words whose counterpart in the original text is a common word due to grammar differences, especially when it comes to verb tenses. We found several situations where the future tense results in a rare word, because in spoken Portuguese proper future conjugation (e.g., ouvirei) is replaced with a periphrastic combination of auxiliary verb in present tense + infinitive of the main verb (e.g., vou ouvir). This means that whenever a child is confronted with a verb that is conjugated in the future, he/she is faced with a rare word, a fact that is aggravated when there is a pronoun in the middle of the verb word, which in spoken language practically never happens. So, concedê-lo-ei and enfeitar-se-á are rare words, whereas their original counterparts glad to grant it and will decorate are not. But it should be added that the future is not the only tense where this happens, as there are other conjugations in Portuguese that do not exist in English. This explains why common forms like kept and thought are translated as the rare words mantiveram and tencionasse, respectively. Moreover, grammatical reasons behind the use of rare words in translations are not always related to verbs. They can also derive from the need or choice to use adjectives, for example, rather than nouns, as is the case with river beach and had feathers, which result in praia fluvial and emplumadas, respectively.
Another motive for translators to opt for rare words might be the desire to make the text funny. This happens, for example, in the translation of Do Not Open This Book (Lee, 2018), where the English words wow, gold, right and something awful are not literally translated as "uau", "ouro/dinheiro", "bom / muito bem" and "uma coisa horrível", but rather as arre, guito, porreiro and cataclismo, respectively, all of which are uncommon words for children. It appears that this choice is meant to make the character sound comically idiosyncratic, as these words are informal but mostly outdated (the kind of slang that children's parents or grandparents would use).
Finally, it should be noted that, in some cases, the literal translation of a concept simply happens to be a rare word, even though the source-language word might be common. This is the case, for example, with frame, from, puff, sign and slid, which in Portuguese result in estrutura, sob, baforada, letreiro and deslizou, all of which are probably unfamiliar to most native speakers aged 3-5. This could be seen as a fortunate coincidence, if we regard picturebooks as a welcome source of new words that children thus have the chance to learn.
In the instances described above, different reasons or translation options account for particular kinds of modulation or mutation: a change from high-frequency words in the source language to low-frequency words in the target language. In many situations, though, the rare word in the translated picturebook corresponds to an equally rare word in the original text, and in such situations we might say that the process in question is translation rather than modulation (transcreation). In such cases, the words in question sometimes belong to a technical or scientific lexicon (as with names of animal and plant species, objects in outer space, instruments used in certain activities, types of materials, etc.). Such is the case with asteroids, harpies, honeycomb, nymphs, reeds, telescope, tilapia and many more, which inevitably become asteroides, harpias, favos, Ninfas, juncos, telescópio, and Tilápia. It also happens that rare English words result in rare Portuguese words because they are outdated or old-fashioned, as in the case of Beatrix Potter's Kitty in Boots, whose "difficult" words sometimes refer to objects, concepts and practices that children can no longer observe in their daily lives, such as muffs, gun-powder, pellets, sportsman, gaiters and brandishing (Potter, 2016).
Sometimes rare words are used in both original and translation because the author has opted for a formal, literary style, which the translator is naturally compelled to maintain. This is the case with feasted, overjoyed, slithery and spluttered, which are duly translated as banquete, radiantes, serpenteante and balbuciou. We also found that unfamiliar vocabulary in picturebooks may be present because the text refers to abstract realities that are not normally conceived by or mentioned to children, but rather belong to a more grown-up conceptual universe, such as ambition, wraiths and confiscated, which translate as ambição, assombrações, and confiscar, respectively.
Another, albeit infrequent, reason for rare words to appear in picturebooks, in the original as well as in the translation, is the fact that some authors use neologisms, words that they invent for stylistic purposes. This accounts for the appearance of new and strange vocabulary such as mermish, stratosthingy and great, great, great times a trillion, zillion, squillion-granny, which in Portuguese are translated as sereiês, estratocoisa and teteteteteterarararavó, respectively.

DISCUSSION
Our findings reveal that picturebooks translated from English to Portuguese are well represented in the PNL list of recommended books for 3-5-year-olds, since they account for 86 out of a total of 167 books recommended for 3-5-year-olds between 2017 and 2020. The 86 picturebooks analyzed have an average book length of 540 words, a lower figure than that reported in a study of the 100 most commonly read picturebooks in the United States, which found the average book length to be about 680 words. In this sense, our findings suggest that the children's books we analyzed are shorter than those typically read to American children.
This study shows that translated picturebooks present many opportunities for children to hear and learn rare words. A total of 565 rare words are present in the 78 picturebooks that have such words, as Escolex searches revealed. These rare words in the translated picturebooks are non-existent or extremely low-frequency words that 3-5-year-old children are not likely to be exposed to when they encounter the typical written material in textbooks designed for grade one instruction. First grade textbooks are designed to teach children to read and, as such, we can expect that they do not include unfamiliar words or words that are not commonly used. This, however, highlights the importance of exposing children to the less common, rich and varied vocabulary found in picturebooks. This study suggests that the translated books we analyzed are a good source of vocabulary learning for young children because of the choices made by translators: 22% of the rare words either have no counterpart in the original texts or correspond to a word that is not rare in the original.
Moreover, the linguistic corpus we examined includes very few repetitions of rare words. This is in accord with previous findings that stories, when compared to playtime, for example, include more rare words (Montag et al., 2018). In our study, we find an average of 6.6 rare words per picturebook. This figure is higher than that reported in studies of the number of rare words in CDS (Rowe, 2012; Weizman & Snow, 2001). However, in this study the comparison of interest was text-to-text: the occurrence of rare words in the target language versus their counterparts in the source language.
The picturebooks in our sample exhibit a Pareto-like distribution of words that resembles Zipf's law, which belongs to a family of probability distributions commonly labelled "heavy-tailed" that have been the subject of much research in the economic and social sciences since the work of Mandelbrot and others (see, e.g., Mandelbrot, 1983) on the fractal behaviour of natural and human phenomena. This is an interesting finding because it indicates that rare words appear with higher-than-normal frequency in these translated works.
The finding that a little over one fifth (22%) of the rare words identified in Portuguese were created by the translators or had a high-frequency equivalent in English indicates that the very process of translating picturebooks offers young Portuguese children the opportunity to hear rare words. This is an important finding because research shows that "the earlier the age of acquisition of words, the better their memory and processing in adulthood" (Massaro, 2015, p. 515). It goes without saying that Portuguese children can be exposed to new words via literature in their mother tongue or via translations from other source languages. However, the focus in this study was on uncovering the extent to which translated picturebooks from English provide Portuguese children opportunities for word learning. They do, not only because they are picturebooks, but also because of the choices made by translators.
Additionally, in documenting the choices made by translators, we describe how they choose among different options to render the books' meaning for young Portuguese children. In this respect, we found that translators do not prioritize the principle of simplification to the point of favouring the repetition of common words over introducing novel vocabulary, even in picturebooks aimed at children aged 3-5. The percentage of titles with no new words (9.5%) indicates this: it would be much higher if the translators' main concern were to keep the text essentially simple, since this priority would lead them to use already known words to ensure understandability. Our analysis allowed us to conclude that Portuguese translators of English picturebooks focus more on the principle of explicitation, even if applying it leads to the use of vocabulary that is new to children: whenever meaning is more accurately conveyed with an unfamiliar word, they use it. This is true not only of nouns (which refer to specific animal and plant species, abstract concepts and other notions uncommon in children's lexicon), but also of adjectives and verbs, especially those that refer to emotions, states of mind, attitudes and intentions. Moreover, other motives lead translators to choose rare vocabulary, such as rhyme, rhythm, humor, and other stylistic exigencies. As to the principle of normalization, we did not find relevant, conclusive evidence to show that it is applied frequently and consistently.
Thus, with regard to the view that picturebooks for children, and their translations, are written in "a 'scaled-down' version of 'language in general', simplified to be made accessible to these young readers" (Thompson & Sealey, 2007, p. 2), our findings suggest that both writers and translators often make texts intentionally challenging, vocabulary-wise, given that the percentage of new words in translations whose original counterpart is also a rare word is as high as 77%. In addition, we found that a considerable percentage (22%) of the rare words in translated picturebooks correspond to a common, high-frequency word in the original text or have no counterpart in it at all. These instances are the ones that provide evidence that translated picturebooks involve transcreation, the process of adapting the material for a new linguistic and cultural context (Ketola, 2018).
The rare words that result from the process of creating the text in the target language reflect the translators' choices of modulation, mutation, and translation processes. Interestingly, when translation occurs, some of the rare words in the Portuguese version of picturebooks are not cognates of the English original words even when there is a cognate equivalent. This happens, for example, when alvoroço is used instead of excitação to translate excitement, when deparou is chosen instead of encontrou to translate encountered, or when prediletos is preferred to favoritos to translate favourite. However, we did find that some of the rare words in the Portuguese translations are cognates of the original words (e.g., extravagância/extravaganza, fascinante/fascinating, platina/platinum, ventríloquo/ventriloquist). The percentage of cognates in the total number of rare words (18%) is in accordance with the lexical similarity between English and Portuguese, which is estimated to be 20.4% (García & Souza, 2014). Nonetheless, translators opt for cognate words more often when the words are rare in both languages (95/438) than when they choose a rare word in Portuguese to render the meaning of a high-frequency word in English (7/76).
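The cognate figures above are given as raw fractions, so it may help to see them as rates. The sketch below recomputes them from the counts in the text; the names are illustrative, and the overall rate assumes that the 51 words produced by mutation contribute no cognates, since they have no English counterpart.

```python
# Counts taken from the text; all names below are illustrative.
TOTAL_RARE = 565                 # all rare words in the Portuguese corpus

RARE_IN_BOTH = 438               # rare in Portuguese and in English (translation)
COGNATES_RARE_IN_BOTH = 95

COMMON_IN_ENGLISH = 76           # rare in Portuguese, not rare in English (modulation)
COGNATES_COMMON_IN_ENGLISH = 7


def rate(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100 * part / whole


# Cognate rate within each translation process.
rate_translation = rate(COGNATES_RARE_IN_BOTH, RARE_IN_BOTH)
rate_modulation = rate(COGNATES_COMMON_IN_ENGLISH, COMMON_IN_ENGLISH)

# Overall rate, assuming words produced by mutation are never cognates.
all_cognates = COGNATES_RARE_IN_BOTH + COGNATES_COMMON_IN_ENGLISH
rate_overall = rate(all_cognates, TOTAL_RARE)

print(f"translation: {rate_translation:.1f}% cognates")
print(f"modulation:  {rate_modulation:.1f}% cognates")
print(f"overall:     {rate_overall:.1f}% cognates")
```

On these counts, the cognate rate for words that are rare in both languages is roughly twice the rate for words that replace a high-frequency English word, which is the contrast the paragraph describes.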
All of the new words present in our sample of translated picturebooks, whether or not their counterpart in the original text is another rare word, confirm the idea that literature aimed at children contributes to enhancing and enlarging their vocabulary, as it resorts to a language that is more varied and sophisticated than oral speech, especially speech directed at children. The fact that children are confronted with unfamiliar words as they listen to stories being told and shown to them is positive in that it enables them to increase their lexical knowledge (Malvern et al., 2004). In addition, it should be noted that their reading experiences are in fact conducted by adults, who play a key role in explaining the meaning of more challenging words, sentences and parts of the texts (Flack et al., 2018). In any case, the number of rare words in our sample of picturebooks aimed at children aged 3-5 is, in most cases, not that high (an average of 6.6 per book), so it does not compromise the understandability of the work as a whole.
It is well documented in the literature that a reader needs to know about 95% of the words in a text to comprehend its meaning (Adams, 2009). Given that the translated picturebooks contain 6.6 rare words on average, this vocabulary load seems appropriate, especially since children aged 3-5 are not readers themselves but instead hear the text read to them and can look at the illustrations in the picturebooks to construct meaning from the text. At the same time, it seems that translators do not allow their version of the text to become so filled with new words that it becomes too complex, unless the original is inevitably rich in rare vocabulary, as is the case with Beatrix Potter's Kitty in Boots, the most lexically challenging book in our sample, which corresponds to the extremely high peak in Figure 1.
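A rough back-of-the-envelope check of the 95% criterion against the sample averages can be sketched as follows. This is an illustration under strong simplifying assumptions that the study does not make explicit: each rare word is counted as occurring once per book, every other word is treated as known to the listener, and the sample means (540 words, 6.6 rare words) are treated as if they described a single typical book. All names are illustrative.

```python
# Averages taken from the text; names and assumptions are illustrative.
AVG_WORDS_PER_BOOK = 540      # mean book length in the sample
AVG_RARE_PER_BOOK = 6.6       # mean number of rare words per book
COVERAGE_THRESHOLD = 95.0     # % of known words cited for comprehension

# Assumption: each rare word occurs once and all other words are known.
rare_share = 100 * AVG_RARE_PER_BOOK / AVG_WORDS_PER_BOOK
known_share = 100 - rare_share

# Number of unfamiliar running words a 540-word text could contain
# while still meeting the threshold.
max_unknown = AVG_WORDS_PER_BOOK * (100 - COVERAGE_THRESHOLD) / 100

print(f"rare words:  {rare_share:.1f}% of running words")
print(f"known words: {known_share:.1f}% of running words")
print(f"unknown words allowed at {COVERAGE_THRESHOLD:.0f}%: {max_unknown:.0f}")
```

Under these assumptions, rare words make up a little over 1% of the running text of an average book, well within the margin implied by the 95% criterion. The estimate says nothing about individual books, such as the outlier mentioned above.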
These linguistic-based findings can inform future research, education practices and translation studies. First, they provide new information about lexical quantity and type of lexicon used in translated picturebooks. Second, they can inform educationalists/practitioners about the language learning opportunities, specifically the vocabulary they are likely to find in translated picturebooks. Third, they can prompt translators to consider how many words they are/should be using that are likely to be new words children hear.
It would be interesting to have similar studies conducted in different languages. It is possible that translations of picturebooks from other languages might yield similar results. Also, the use of rare words by translators may be associated with the degree of similarity between languages and/or with the translators' training. It would also be interesting to look into whether the same or similar types of rare words are present in picturebooks written by Portuguese authors, because we know that the learning of new words is enhanced when children have multiple opportunities to hear the same words used in diverse contexts (Jones, Jones & Recchia, 2012). This repetition of new vocabulary optimizes word learning and contributes to children's passive and active word learning. Research indicates that comprehension (passive knowledge) precedes production (active knowledge) and that 3- and 4-year-olds can understand a word after only a few exposures (Childers & Tomasello, 2002). Studies also show that the more adults expose children to novel words, the more children produce them (Wasik & Hindman, 2014), and that familiarity with a word is achieved after 3 or more exposures (Mikk, 2000).
Since we did not have access to a dataset with a printed corpus similar to Escolex, either for British or American English, we used the CHILDES dataset to determine whether the rare words in Portuguese were also rare words in English. This is a limitation of the present study. Nonetheless, given that CDS contains fewer rare words than picturebooks, if the words are already present in the oral (receptive or productive) vocabulary of 3-5-year-old children, this indicates that they are not rare words in written texts in the source language.
Lastly, the sample of picturebooks is composed of those books recommended by PNL. The list can be thought of as equivalent to a librarian-recommended list. In fact, librarians in public schools are encouraged to acquire books from PNL lists. Nonetheless, this does not ensure that these picturebooks are actually read to children, either in CCLE or in HLE contexts. Thus, future research should look at whether these are representative of the picturebooks children are actually exposed to (Kam & Matthewson, 2017) and at how parents and teachers scaffold language development and teach word meanings (Biemiller, 2006). Research clearly shows that pointing to and commenting on illustrations, as well as the explanation of word meanings by adults, contributes to vocabulary learning (Şimşek & Işıkoğlu Erdoğan, 2021).

Rare words in Portuguese that are high-frequency in English