Difference between revisions of "Korean Wiki Project talk:Current Projects"

From Korean Wiki Project
(new version of script)
 
(9 intermediate revisions by 2 users not shown)
Line 1: Line 1:
I just started reformatting a few pages and I have some problems/questions. I'm not sure whether this is where I should write it, but I guess it doesn't matter that much.
+
==Phase 1==
 +
*Old discussion for phase 1: [[Korean_Wiki_Project_talk:Current_Projects/phase1]]
  
1. When I upload (through the [[Special:MultipleUpload]]) it automatically adds a line saying "Category:", even though I already added the "Category:Hanja characters" in the summary field. See [http://www.koreanwikiproject.com/wiki/index.php?title=File:茉.png this].
+
==Phase 2==
I don't think that bit of text makes much difference or messes anything up, but I'm curious whether it's something that should be changed.
+
Discussion on what we'd like to get done for phase 2. Please add your ideas below:
 +
*Extend the hanja template to have an optional parameter for linking to related characters, for example 水(수 - water) can link to 氷(빙 - ice).
 +
*It'd be nice to have some kind of page where you input a word, and it will put it on all the pages that have the word instead of having to add it manually to each page. For example 입구 <入口> would be added for pages of 입 and 구. This would probably require a bot and more research needs to be done: http://en.wikipedia.org/wiki/Wikipedia:Creating_a_bot
 +
:In this case it may be useful to have a central place where these words can be added (maybe somewhere in the vocabulary section?). After being added the words can then automatically be distributed over the hanja. Just thinking out loud, as it may be hard to maintain if every change to a definition will have to be replicated at numerous places --[[User:Ikzelf|Ikzelf]] 22:56, 13 May 2010 (UTC)
  
2. Too many examples in the hanja box seem to break it.
+
*A script to extract Korean names for each character, probably from naver. By Korean name for the character I mean like 日 is called "날 일" to distinguish it from 一 (한 일).
Compare this [[말]] to [http://www.koreanwikiproject.com/wiki/index.php?title=%EB%A7%90&oldid=12663 this]. The maximum number of lines seems to be 6. Is this fixable, or should I just not put in more than 6 examples?
+
*A script to extract missing definitions for each character, probably from wiktionary.
::<font color=brown>I looked into this with the notion that you had misplaced or forgotten a formatting mark. Sure enough, I found a " > " in one of the Hanja links. After making this adjustment, the example partition of the Hanja boxes worked perfectly. If you encounter this again, please don't hesitate to make a note of it here.
+
:I think Wiktionary uses CC by-sa, while KWP uses by-nc-sa. I think these licenses are incompatible? If we would use Wiktionary, would it be an option to put the Hanja pages under by-sa? --[[User:Ikzelf|Ikzelf]] 22:56, 13 May 2010 (UTC)
:::The problem was at the last character of this line (before the ''- minor clerk''), which was ''coincidentally'' the same line where the problem began: </font><nowiki>**말단 사원 <[[末]][[端 社]][[員> - minor clerk</nowiki><font color=brown> --[[User:Galinaros|Galinaros]] 03:02, 3 April 2010 (UTC)</font>
+
3. Multiple hanja in one box. I came across these two hanja: [[마#.E9.BA.BB_.E7.97.B2]]. Should there be a comma in between them? If so, the link to Naver breaks (it links to "first hanja"+"comma"). Also, how can I add pictures to this hanja? Just pick the first one?
+
  
4. Should the first letter of the English word in the "meaning" field be capitalised?
+
===Misc===
::<font color=brown>See [http://www.koreanwikiproject.com/wiki/images/e/e8/%EB%9D%BD_page_after.jpg here]. From this, I believe that you are right to assume that the first letter is to be capitalized. --[[User:Galinaros|Galinaros]] 03:02, 3 April 2010 (UTC)</font>
+
*A script that will parse the level 0 - level 8 hanja pages into a nice format.
--[[User:Martanor|Martanor]] 14:31, 2 April 2010 (UTC)
+
*Anki deck cards related to learning hanja
 +
:Great idea, I would love to have such cards --[[User:Ikzelf|Ikzelf]] 22:56, 13 May 2010 (UTC)
  
 +
===Extending template===
  
Thanks for answering, Galinaros ^^ . I need to look into that double hanja one; it should probably not be in the same box. I'm not sure about the exact situation of that character; I think a native speaker added it before. Great work so far, Martanor, even though we haven't officially started yet; it seems like you know what you're doing already. I've been busy with work so I haven't been able to finish the tutorial video. I think I'll make it simpler than I originally intended. I still have to show you guys how to pull out all the hanja characters using Character Map on Windows. --[[User:DigitalSoju|DigitalSoju]] 03:58, 3 April 2010 (UTC)
+
{{Hanjabox2
 +
|char =  水          <!--The hanja character-->
 +
|def =              <!--Definition-->
 +
|kor name =        <!--Name of the character in Korean-->
 +
|info =            <!--Additional info-->
 +
<br><br>
 +
'''Related characters:''' [[]], [[永]]
 +
|examples =        <!--Examples, put on the next line after *-->
 +
*(examples)
 +
|real pics =        <!--make the image size 200px please-->
 +
}}
  
 +
==Semantic Wiki==
 +
I've recently heard of something called Semantic wiki; apparently it's an extension:
 +
*http://semantic-mediawiki.org/wiki/Help:Inline_queries
  
Thanks for the quick answers. Yeah, I took the liberty of starting on the hanja beginning with 'ㅁ'. I hope that's okay by you. So far I've added pictures to every hanja that already was there but I have a few more questions:
+
It might be useful for hanja pages as we can put all examples into a page, and retrieve it for individual pages, instead of updating them one by one. What do you guys think? Anyone familiar with this?--[[User:DigitalSoju|DigitalSoju]] 04:34, 11 June 2010 (PDT)
 
+
1. Does "Name of character" also include the Korean translation (훈/訓) of the character? And if so, how should it be formatted? [http://www.koreanwikiproject.com/wiki/images/e/e8/%EB%9D%BD_page_after.jpg This] picture shows both the Korean translation (풍류) and pronunciation (락) though they're just separated by a space which doesn't seem like the ideal format.
+
::We'll worry about the Korean translation in the next step, but before you select a character from the menu, you can highlight it and see its name.
+
2. Do you want us to delete "Want to know more about 한자? Head over to the important 한자 page"?
+
:Yeah delete that
+
3. Do we really have to do the captcha before every page save? Maybe those who are on the project can be exempt from it.
+
:Captcha shouldn't be on. Hmmm, that's strange.
+
--[[User:Martanor|Martanor]] 14:13, 3 April 2010 (UTC)
+
 
+
 
+
----
+
 
+
 
+
1. I have been deleting the "Want to know more about 한자? Head over to the important 한자 page." at the top of each page, but several times I have come across the <nowiki>{{hanja}}</nowiki> formatting mark. It essentially says the same thing as "Want to know more about ... " but I'm pretty sure it can be edited easily all from one spot rather than changed individually. Would you like these to stay at the top of each page?
+
:Good questions! Originally I was going to make a template instead of typing out that message each time, but for now I think it's kind of useless. I was thinking about putting a hanja header on each of those pages, but I'm not sure.
+
2. When an example's hanja is being linked, should the hanja that is being exhibited in the example be linked as well? For example, an example under the hanja character 格 is "''자격 <[[資]][[格]]> - Qualification''". Should 格 be linked?--
+
[[User:Galinaros|Galinaros]] 20:34, 7 April 2010 (UTC)
+
:I think we should link it for consistency in terms of looks, hopefully the user won't actually click it as it would just bring him to the same spot. --[[User:DigitalSoju|DigitalSoju]] 23:03, 7 April 2010 (UTC)
+
3. I have noticed that the spacing on each page varies depending on who edited it. I think we should formalize the spacing between every hanja entry. I do not wish to offer my style as the correct one, but I will post it below anyway:
+
*At the top of each page, after all <nowiki>{{stub}} and {{disambig}}</nowiki> tags (and any disambiguation information), I press "enter" once and enter the "Total Characters: X" portion of the hanja page.
+
*On the next line after that I place any alert message text (i.e., <nowiki>{{Notice|MESSAGE}} and {{NeedsNativeCheck|REASON}}</nowiki>) even if the wiki displays the actual info in another position.
+
*After all of that information, I leave 2 empty lines (which would be three "enters" from the alert message text line) and then begin the first hanja for that jamo's page. After each hanja entry are 2 empty lines, just like before. Observe:
+
**[http://www.koreanwikiproject.com/wiki/index.php?title=%EA%B2%81&action=edit Example with no extra parts]
+
**[http://www.koreanwikiproject.com/wiki/index.php?title=%EA%B2%AC&action=edit Example with a stub]
+
**[http://www.koreanwikiproject.com/wiki/index.php?title=%EA%B1%B4&action=edit Example with disambiguation info and a needs-native-check alert]
+
Please tell me your thoughts on this. Thanks, --[[User:Galinaros|Galinaros]] 18:38, 8 April 2010 (UTC)
+
:Sounds good, I'll try to pay attention to the consistency of my spacing. --[[User:DigitalSoju|DigitalSoju]] 23:14, 8 April 2010 (UTC)
+
 
+
 
+
----
+
 
+
 
+
I wrote a small PHP script to take care of some of the reformatting. It takes a block of wiki code and returns it with the examples reformatted. It transforms lines of the type:<br />
+
*<nowiki>  *[[장]]녀 <長女> - The oldest daughter</nowiki><br />
+
to:<br />
+
*<nowiki>  *장녀 <[[長]][[女]]> - the oldest daughter </nowiki>
+
 
+
The script is here: http://amstelveen.adowi.com/k_example.php<br />
+
Not sure if it's useful to anyone, but I share it here just in case. --[[User:Ikzelf|Ikzelf]] 00:13, 9 April 2010 (UTC)
+
 
+
That's great. We'll also need a parser in the future for pages like this: [[Level 7]]. Someone dumped some hanja there and it's not formatted ideally. It would be nice to have it formatted properly, but we should agree on a pattern first. Great job! Oh, one other thing: is it possible to add an if statement to ignore the characters - and ―? They mean there is no hanja character for that syllable, for example: 육개장 <[[肉]]-[[醬]]>. The middle syllable 개 isn't hanja. --[[User:DigitalSoju|DigitalSoju]] 02:18, 9 April 2010 (UTC)
+
:Hey, thanks. I'd be happy to help out with the Hanja Levels at some later time. The dash character '-' was already ignored, I think/hope. I've now added the '―' character to the ignore list too, so this should work properly now. --[[User:Ikzelf|Ikzelf]] 22:38, 9 April 2010 (UTC)
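
For anyone who wants to try the same reformatting locally, here is a minimal sketch of the transformation described above, written in Python rather than PHP. It is not the code behind k_example.php; the names <code>reformat_example</code>, <code>link_hanja</code>, <code>EXAMPLE_LINE</code> and <code>PLACEHOLDERS</code> are made up for illustration, and the assumed line pattern ("*word <hanja> - meaning") and the lower-casing of the meaning are only inferred from the before/after example in this thread.

```python
import re

# Characters that mean "no hanja for this syllable"; they stay unlinked.
PLACEHOLDERS = {"-", "―"}

# Assumed shape of an example line: "*word <hanja> - meaning".
EXAMPLE_LINE = re.compile(r"^(\*+)\s*(.+?)\s*<([^<>]+)>\s*-\s*(.+?)\s*$")


def link_hanja(hanja: str) -> str:
    """Wrap each hanja character in [[...]], skipping placeholders and spaces."""
    plain = hanja.replace("[", "").replace("]", "")  # drop any existing link brackets
    return "".join(
        ch if ch in PLACEHOLDERS or ch.isspace() else f"[[{ch}]]"
        for ch in plain
    )


def reformat_example(line: str) -> str:
    """Move the wiki links from the hangul word onto the individual hanja."""
    match = EXAMPLE_LINE.match(line.strip())
    if match is None:
        return line  # not an example line: leave it untouched
    bullets, word, hanja, meaning = match.groups()
    word = word.replace("[[", "").replace("]]", "")  # unlink the hangul
    meaning = meaning[0].lower() + meaning[1:]       # assumption: lower-case first letter
    return f"{bullets}{word} <{link_hanja(hanja)}> - {meaning}"


if __name__ == "__main__":
    print(reformat_example("*[[장]]녀 <長女> - The oldest daughter"))
    print(reformat_example("*육개장 <肉-醬> - spicy beef soup"))
```

Lines that do not match the assumed pattern are returned unchanged, so a whole page can be fed through line by line and anything unusual is left for manual checking.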
+
 
+
Very nice work. I've an urge to ask you to extend it to more use, such as taking the entire thing to its proper formatting and only leaving the user to decide on example-to-hanja conversions (though that might be a little useless due to the varying states in which many of the current hanja articles are). Me trying to throw ideas for improvements usually means I'm quite pleased with it. Good job :D --[[User:Galinaros|Galinaros]] 02:40, 9 April 2010 (UTC)
+
:Ah thanks. I'm not sure I have too much spare time to spend on it ^^. Still, I'm trying to get some additional parsing working over the weekend. Any suggestions for improvements are welcome of course. So, thanks for them! --[[User:Ikzelf|Ikzelf]] 22:38, 9 April 2010 (UTC)
+
 
+
 
+
I made a new version that can help with the conversion of the hanja blocks. It takes a list of Hanja characters and the whole page of wiki code, and can do a fair part of the conversion automatically. It doesn't get things 100% right, and requires a fair amount of manual checking, but for my own uses it's alright now. I think it can reduce the workload for editing a page.
+
 
+
It still gets confused a bit on non-standard cases, but so far the results seem usable as a first pass for converting a page. Let me know what you think, and if you find any errors.
+
 
+
New version is here: http://amstelveen.adowi.com/k_hanja.php
+
--[[User:Ikzelf|Ikzelf]] 23:06, 11 April 2010 (UTC)
+

Latest revision as of 12:34, 11 June 2010

Phase 1

Phase 2

Discussion on what we'd like to get done for phase 2. Please add your ideas below:

  • Extend the hanja template to have an optional parameter for linking to related characters, for example 水(수 - water) can link to 氷(빙 - ice).
  • It'd be nice to have some kind of page where you input a word, and it will put it on all the pages that have the word instead of having to add it manually to each page. For example 입구 <入口> would be added for pages of 입 and 구. This would probably require a bot and more research needs to be done: http://en.wikipedia.org/wiki/Wikipedia:Creating_a_bot
In this case it may be useful to have a central place where these words can be added (maybe somewhere in the vocabulary section?). After being added the words can then automatically be distributed over the hanja. Just thinking out loud, as it may be hard to maintain if every change to a definition will have to be replicated at numerous places --Ikzelf 22:56, 13 May 2010 (UTC)
  • A script to extract Korean names for each character, probably from naver. By Korean name for the character I mean like 日 is called "날 일" to distinguish it from 一 (한 일).
  • A script to extract missing definitions for each character, probably from wiktionary.
I think Wiktionary uses CC by-sa, while KWP uses by-nc-sa. I think these licenses are incompatible? If we would use Wiktionary, would it be an option to put the Hanja pages under by-sa? --Ikzelf 22:56, 13 May 2010 (UTC)
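
As a rough illustration of the "central word list" idea above, the sketch below takes a list of words and works out which syllable pages each example line would belong on. It only builds the mapping in memory and does not talk to the wiki or any bot framework. The function name <code>distribute</code>, the tuple layout (word, hanja, meaning) and the sample entry are assumptions for illustration, and it assumes each hangul syllable lines up one-to-one with a hanja character or a placeholder (- or ―).

```python
from collections import defaultdict

# Characters that mean "no hanja for this syllable".
PLACEHOLDERS = {"-", "―"}


def distribute(entries):
    """Map (hangul syllable, hanja character) to the example lines for that entry.

    entries: iterable of (word, hanja, meaning) tuples, e.g. ("입구", "入口", "entrance").
    """
    pages = defaultdict(list)
    for word, hanja, meaning in entries:
        if len(word) != len(hanja):
            raise ValueError(f"{word!r} and {hanja!r} do not line up one-to-one")
        linked = "".join(
            ch if ch in PLACEHOLDERS or ch.isspace() else f"[[{ch}]]"
            for ch in hanja
        )
        line = f"*{word} <{linked}> - {meaning}"
        for syllable, char in zip(word, hanja):
            if char in PLACEHOLDERS or char.isspace():
                continue  # nothing to file this syllable under
            if line not in pages[(syllable, char)]:
                pages[(syllable, char)].append(line)
    return dict(pages)


if __name__ == "__main__":
    for key, lines in distribute([("입구", "入口", "entrance")]).items():
        print(key, lines)
```

With the sample entry, the same example line is filed under both (입, 入) and (구, 口), which matches the 입구 example in the list above. Keeping the words in one list and regenerating the per-page examples from it would also address the maintenance worry, since a definition would only need to be changed in one place.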

Misc

  • A script that will parse the level 0 - level 8 hanja pages into a nice format.
  • Anki deck cards related to learning hanja
Great idea, I would love to have such cards --Ikzelf 22:56, 13 May 2010 (UTC)

Extending template

Meaning: Name of character:
Additional info:

Related characters: ,

For more information, check out 水 on Hanja Explorer or in the naver dictionary.
水.png
  • (examples)
Click on the pictures to enlarge:



Semantic Wiki

I've recently heard of something called Semantic wiki; apparently it's an extension:

It might be useful for hanja pages as we can put all examples into a page, and retrieve it for individual pages, instead of updating them one by one. What do you guys think? Anyone familiar with this?--DigitalSoju 04:34, 11 June 2010 (PDT)