Wednesday, December 12, 2007

Project Gutenberg Notes - Uploading Files

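According to a letter from Project Gutenberg editor David Widger, the uploaded files still have several issues that need urgent attention. Receiving a clearance code means that the book has been certified as a public-domain work.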

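Content that does not conform to Project Gutenberg's rules cannot be accepted, and so receives no Project Gutenberg etext number.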

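According to David Widger, the two letters below were sent to the relevant participants at the same time. I suggest translating them into Chinese with translation software, with a dictionary as a supplement:
http://www.google.com/translate_t
David Widger welcomes direct replies from participants.
Having me pass messages along is also a good channel.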

The gist of the two letters follows.

--->

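I, David Widger, am an editor at Project Gutenberg. My job is to:

  1. Receive the files sent in by volunteers
  2. Check the text for errors
  3. Add the header, trailer, and metadata
  4. Post the files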

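Today I received 27 Chinese public-domain files forwarded by Professor Greg Newby, followed a little later by 4 more, and was told that more will keep arriving. Nine of them, plus one more, have already been posted and should be online within 24 hours.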

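I will find time to look closely at the files that do not meet Project Gutenberg's requirements and will communicate directly with those volunteers.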

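For now, several issues need to be discussed:
  1. Please do not add a header or trailer; I have to delete them. That step is automatic, so volunteers need not trouble themselves with it.
  2. Some volunteers do not understand the upload form and therefore supply wrong information, for example entering the translator in the author field.
  3. Telling surname from given name. Project Gutenberg's rule is given name first, then surname. I ask this student in particular to help me work out the authors' surnames and given names: Xu Jia-Fu <f780424@????>
  4. My editor can read UTF-8 and UTF-16 but cannot display Big-5; our posting process can handle UTF-8 but not UTF-16 or Big-5. Please send UTF-8 encoded files.
  5. Only one file may be uploaded per work.
  6. Files in .rtf or .doc format from Microsoft word-processing software must be converted to a UTF-8 encoded .txt file (PG can include the .rtf or .doc as well, but only alongside the .txt).
  7. Browsers on some computers (IE 6.0 or older) cannot handle UTF-8, so the Chinese entered in the upload form cannot be read. fanghsuan hsu <poiuy0127@????>, please help us with this.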
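
As an aside on item 4, here is a minimal Python sketch of how a volunteer might re-encode a text file as UTF-8 before uploading. It is not Project Gutenberg's own tooling (the letter below mentions recode.exe); the function name `to_utf8` is hypothetical. It assumes the input is UTF-8, UTF-16 with a byte-order mark, or Big-5, and it only trusts UTF-16 when a byte-order mark is present, because almost any even-length byte string will "decode" as UTF-16.

```python
import codecs


def to_utf8(raw: bytes) -> bytes:
    """Return the text in `raw` re-encoded as UTF-8 (hypothetical helper).

    Assumes the input is UTF-8, BOM-marked UTF-16, or Big-5.
    """
    # UTF-16 is accepted only when a byte-order mark says so.
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return raw.decode("utf-16").encode("utf-8")
    try:
        # "utf-8-sig" also strips an optional UTF-8 byte-order mark.
        return raw.decode("utf-8-sig").encode("utf-8")
    except UnicodeDecodeError:
        # Last resort: Big-5. This raises UnicodeDecodeError if the
        # bytes are not valid Big-5 either, rather than guessing.
        return raw.decode("big5").encode("utf-8")


def convert_file(src: str, dst: str) -> None:
    """Read `src` in any of the assumed encodings and write UTF-8 to `dst`."""
    with open(src, "rb") as f:
        data = to_utf8(f.read())
    with open(dst, "wb") as f:
        f.write(data)
```

Opening the converted file in a Unicode-aware editor and checking that the Chinese reads correctly is still worthwhile, since a file in some other legacy encoding could slip through the Big-5 fallback.
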
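The following files have been processed and given Project Gutenberg etext numbers, and should be online within 24 hours. Please confirm the authors' surnames and given names.
If anything needs to be changed, please let me know by email. 23814, 23816, 23817, and 23818 are already live links.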

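Unfortunately, my Chinese is as poor as your English. As Professor Greg Newby has said, we will surely overcome this barrier and bring a great many Chinese public-domain works into Project Gutenberg.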

Thank you for passing on the above for me.
Please keep in touch.

<-----


Letter 1
Submission of texts in Chinese to Project Gutenberg


David Widger <cdwidger@gmail.com>

December 13, 2007, 2:02 AM

To: "Prof. Mao" <mao@lins.????>

Cc: Greg Newby


Four additional files were received this morning,
one of which was satisfactory and has been posted
on the internet as etext #23827. Email has been
sent to the other contributors describing the problems.

In making the list of problems in my email of
yesterday I forgot to include a very important
one: legibility of the data placed in the PG
upload form. If data such as the author's
name or the ebook title is entered into the PG
upload form from a computer that is producing
characters in an encoding other than ASCII,
ISO-8859-1, or UTF-8, then the note forwarded to
us for use in posting the file cannot be
read. The data is just meaningless
symbols. Without this data we cannot process the
file even though the UTF-8 text is fine in every other way.

This is an example of an upload note
demonstrating this problem which we of course had
to report to the contributor as unsatisfactory.

TITLE=ƒx¶YÕýÁx
EDITOR=¼Òñ˜ —î
CREDIT=http://ef.cdpa.nsysu.edu.tw/ccw/01/yli.htm
NOTIFY=fanghsuan hsu <poiuy0127@hotmail.com>
FILETYPE=other
LANGUAGE=Chinese
FILES=xY.zip
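
(Blogger's aside.) Symbols like those in the TITLE and EDITOR fields are what appears when bytes written in one encoding are read as another. A minimal Python sketch follows. Which encoding actually produced the note above is not known, so Big-5 bytes read as ISO-8859-1 are assumed purely for illustration, and the sample title is borrowed from the upload note quoted at the end of this post.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Report whether `raw` decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


title = "六祖壇經"  # sample title, used here only as an illustration

# Assumption: the browser sent the form field as Big-5 bytes...
big5_bytes = title.encode("big5")

# ...and the receiving side read those bytes as ISO-8859-1, which maps
# every byte to some character and therefore never reports an error.
garbled = big5_bytes.decode("latin-1")
print(repr(garbled))

# In this assumed scenario nothing is lost, so the mistake is reversible.
recovered = garbled.encode("latin-1").decode("big5")

print(is_valid_utf8(big5_bytes))             # the Big-5 bytes
print(is_valid_utf8(title.encode("utf-8")))  # the same title as UTF-8
```

The practical fix stays the same as in the letter: fill in the form from a browser that sends UTF-8, or enter the names in plain ASCII romanization.
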


Whatever help you can give us in communicating
the problems I have outlined to your students will be much appreciated.

If you have questions regarding the PG Posting
process or if I have not made myself clear on any
of the points, kindly let me know.

With best regards,

David




Letter 2
Submission of texts in Chinese to Project Gutenberg


David Widger <cdwidger@gmail.com>

December 12, 2007, 7:43 AM

To: "Prof. Mao" <mao@lins????>

Cc: Greg Newby


Dear Professor Mao,

I am one of the Project Gutenberg editors who receive files from
volunteers. My colleagues and I check these files for conformity to
PG standards, check them for textual errors (if they are in languages
we read), add the heading and trailing PG metadata and legalese that
you see at the beginning and end of each file, and post them to the internet.

Today I received 27 files in Chinese from volunteers whom Professor
Newby informs me are your students--he also indicates I should expect
many more of these files. Of the twenty-seven submissions there were
nine which, with a bit of work, I was able to post on the
internet. They will appear in the PG catalog in about 24 hours.

I have written email to several of the contributors describing for
their instruction various problems with the nine posted files which I
was able to adjust. As to the other 18 contributors I have not had
time to delve deeply enough into the problems but will do so in the
coming days and communicate directly with them.


There are several issues identified so far:

1. A PG header/trailer attached to the files by some of the
volunteers which needed to be deleted before I could start processing
the file (though sometimes there is more info in these as to author,
etc. than in the PG upload note). Our posting process automatically
adds the PG header and trailer so there is no need for the
contributor to try to add these to their files.

2. The volunteers in many cases seem not to be able to read or
understand the Project Gutenberg upload form thus inserting the wrong
information: for example the Translator's name in some cases entered
as the Author--plus other variations.

3. Determining the proper order for Chinese names. At times I found
the last name first. In English, French, German, and many other
languages it would be easy for me to correct, but with no knowledge of
Chinese I cannot figure out which is which. In my feedback email to
four of the contributors today I have asked for guidelines on
this. The PG standard for the names of author, editor, translator,
and illustrator is first name first and last name last. This standard is
necessary so our cataloguers get it right.

4. I can see in my unicode editor what looks like normal Chinese
characters in the files which are utf-8 or utf-16 but the Big-5
character set is beyond me. None of my editors can handle this
charset and I do not even know if any of the files submitted today
are in Big-5. Our automated posting program can handle UTF-8, but
not UTF-16 or Big-5. In those files posted today which came in as
UTF-16 I recoded them to UTF-8 using the recode.exe program before
starting the posting process. Unfortunately there are several files
among the 18 remaining unposted files which are labeled as UTF-8 but
are neither UTF-8 nor UTF-16. I cannot determine what charset they
are and will have to ask the volunteers to resubmit them in valid
UTF-8 (or in UTF-16 if they are unable to accomplish the recoding to UTF-8).

5. Also there are several files in the unposted 18 which contain
several or many individual text files. PG can only accept one text
file of any charset for an eBook. That file gets the PG
filenumber. So sets with several text files must have these all
combined into one text file. Likewise with several html files, it is
best if there be just one file.

6. Several files were sent as MS Word files with the .rtf or .doc
extension. PG does include these formats but requires that when we do
so a .txt file (for Chinese files, in UTF-8) also be included.
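
(Blogger's aside on item 5.) A work that was typed up as several chapter files can be joined into the single text file PG asks for. This is a minimal Python sketch, assuming every part is already UTF-8 and that the parts are listed in reading order; the function name `combine_text_files` and the file names in the usage note are hypothetical.

```python
from pathlib import Path


def combine_text_files(parts, output):
    """Join the UTF-8 files in `parts`, in order, into one UTF-8 file.

    Returns the combined text so that it can be inspected.
    """
    chunks = []
    for part in parts:
        # "utf-8-sig" drops a leading byte-order mark, which would
        # otherwise end up as a stray character in the middle of the book.
        text = Path(part).read_text(encoding="utf-8-sig")
        chunks.append(text.strip("\n"))
    # One blank line between parts, one newline at the very end.
    combined = "\n\n".join(chunks) + "\n"
    Path(output).write_text(combined, encoding="utf-8")
    return combined
```

For example, `combine_text_files(["ch01.txt", "ch02.txt"], "book.txt")` would produce one `book.txt` ready for upload.
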


Please take a look at the following files I posted this morning. Some
or all of these may need to be reposted once I figure out which name
is the last name for PG cataloguing purposes:

http://www.gutenberg.org/2/3/8/1/23814

http://www.gutenberg.org/2/3/8/1/23816

http://www.gutenberg.org/2/3/8/1/23817

http://www.gutenberg.org/2/3/8/1/23818

http://www.gutenberg.org/2/3/8/1/23821

http://www.gutenberg.org/2/3/8/1/23822

http://www.gutenberg.org/2/3/8/1/23823

http://www.gutenberg.org/2/3/8/1/23824

http://www.gutenberg.org/2/3/8/1/23825


If corrections or changes are needed kindly send advice by email and
I will be glad to make the corrections.

I will greatly appreciate your help in instructing your students as
to the PG requirements above. In several of the emails back and forth
between some of them and me it is obvious that they do not understand much
of what I write, and of course I have equal difficulty understanding them.

With all this said Prof. Newby, Project Gutenberg and I are delighted
to see some eBooks in our catalog in the Chinese language. With your
help I believe we will be able to post many more.

With kindest regards,

David Widger
http://www.gutenberg.net.au/widger/home.html















I am one of the Project Gutenberg Editors and have received your
files for posting. Thank you very much for this contribution.

You have attached a PG header and trailer to the file, which is not
necessary as these are automatically produced during the posting
process. In future files there is no need for you to do this.

Not being familiar with Chinese names I am unsure if I have the
proper order of first and last names for the translators, editors,
and for you. Please see my list below and let me know if this is
correct--and if not please make the needed corrections. For example
in your upload note one of the translators is written hui neng shi
and in another place as Shi Hui-Neng, so you can understand my confusion.

Also I do not know who should be named as the author of this work. Is
the author known? If not we would normally put the author in the PG
catalog as Anonymous.

Again many thanks for submitting this file to PG.

David Widger



Uploaded at December 10, 2007 3:38 pm PST by Xu Jia-Fu <f780424@yahoo.com.tw>

TITLE=Liu Zu Tan Jing
SUBTITLE=Original publication year is A.D.713 at Tang Dynasty
AUTHOR=
TRANSLATOR=Shi Hui-Neng and Shi Fa-Hai
EDITOR=Shi Fa-Hai
CREDIT=Xu Jia-Fu
NOTIFY=Xu Jia-Fu <f780424@yahoo.com.tw>
FILETYPE=other
LANGUAGE=Chinese
FILES=33202-0.txt
SIZE=90223
CLEARED=OK 20071205045346shi liu zu tan jing hui neng shi 1992:p
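
(Blogger's aside.) An upload note in this KEY=VALUE form is easy to check mechanically before sending. The sketch below parses a shortened copy of the note above and lists fields that were left empty. The choice of which fields to check is my own assumption for illustration, not a PG rule, and the function names are hypothetical.

```python
SAMPLE_NOTE = """\
TITLE=Liu Zu Tan Jing
AUTHOR=
TRANSLATOR=Shi Hui-Neng and Shi Fa-Hai
EDITOR=Shi Fa-Hai
LANGUAGE=Chinese
FILES=33202-0.txt
"""


def parse_upload_note(note: str) -> dict:
    """Turn KEY=VALUE lines into a dict; other lines are ignored."""
    fields = {}
    for line in note.splitlines():
        key, sep, value = line.partition("=")
        key = key.strip()
        # Only all-uppercase keys followed by "=" count as fields.
        if sep and key.isupper():
            fields[key] = value.strip()
    return fields


def empty_fields(fields, required=("TITLE", "AUTHOR", "LANGUAGE", "FILES")):
    """List the fields in `required` (an assumed list) that are missing or blank."""
    return [name for name in required if not fields.get(name)]


fields = parse_upload_note(SAMPLE_NOTE)
print(empty_fields(fields))
```

A blank AUTHOR is not always a mistake; as the letter says, a work whose author is unknown would normally be catalogued as Anonymous.
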


