Invelos Forums->DVD Profiler: Contribution Discussion |
Parsing: Robin Wright Penn |
Author |
Message |
Registered: March 29, 2007 | Reputation: | Posts: 4,479 |
| Posted: | | | | Quoting Dr Pavlov: Quote: We don't care about linking, we just want to be able to do what we want. For linking, the best approach is to think, and use the parsing that has the better chance of being the correct one. I'd guess it will be right in 90% of cases. A blind starting point (1/2/3 or 1//23) will give about 50% errors. | | | Images from movies |
| Registered: March 13, 2007 | Posts: 21,610 |
| Posted: | | | | I just noticed a key phrase, Yves: "After that, it depends on each users' collections." Translation: LOCAL. You can do whatever you want LOCALLY. Why is it necessary for you to have the Online match your local? Do whatever you want locally. This discussion is not about YOUR collection or mine, it is about the Invelos collection.
Skip | | | ASSUME NOTHING!!!!!! CBE, MBE, MoA and proud of it. Outta here
Billy Video |
| Registered: April 7, 2007 | Posts: 357 |
| Posted: | | | | I am really curious as to why we need a starting point at all? The best guess based on cultural norms, statistics, other credits research etc will be right far more often than wrong. Going with 1/2/3 or 1//23 is at best leaving it to luck and probably means most will be wrong. What we have here, and in other threads, is the balance of probability clearly points one way but we have people who resolutely want to stick to what is probably wrong, in case it is right, which is a strange way to run a database. | | | Last edited: by Graveworm |
| | T!M | Profiling since Dec. 2000 |
Registered: March 13, 2007 | Reputation: | Posts: 8,736 |
| Posted: | | | | I have to repeat that, as of yet, we have no "official starting point" for parsing. Sure, I'd like Invelos to set one, and I'd like Invelos to define what kind of documentation would be needed to deviate from it, but as it stands, we don't have one. Better yet: I'd even prefer automated parsing, with no ability to deviate from the standard at all. At least that'll get us 100% consistency, and to me, that's much more important than being "correct" - we've never worried about "correct" before, have we? For instance, we're using the "most-credited form" for people's names, even when we know (and can extensively document) that it's wrong. So when we've got our own standard for that, a standard that doesn't necessarily match with what is "correct" or "real", why can't we use a similar consistent standard for this issue as well? These endless debates clearly show that we're never going to work it out among ourselves...
The second point I need to make - once again - is that "documentation" on the parsing of names - the kind of documentation that is acceptable to both sides, that is - virtually doesn't exist. In no more than a handful of cases have we ever managed to find something substantial enough to convince everyone, but in all other cases there really is no way whatsoever to determine the "correct" parsing.
Bottom line: if Invelos doesn't address this, most users will simply keep doing as they see fit, meaning multiple, non-linking entries for just about every name consisting of more than two parts for all eternity. |
| Registered: March 13, 2007 | Reputation: | Posts: 906 |
| Posted: | | | | Quoting Graveworm: Quote: I am really curious as to why we need a starting point at all? The best guess based on cultural norms, statistics, other credits research etc will be right far more often than wrong. Going with 1/2/3 or 1//23 is at best leaving it to luck and probably means most will be wrong.
We need a starting point for those that contribute cast/crew but have no idea what is correct. If something differs from the starting point, that can be documented through best guesses based on cultural norms, statistics, other credits research etc. | | | The colour of her eyes, were the colour of insanity |
| Registered: March 13, 2007 | Reputation: | Posts: 906 |
| Posted: | | | | On another point, I ran into a problem today with the Norwegian actor Sturla Berg-Johansen (notice the dash between the two last names). It is easily documented that his official last name is Berg-Johansen. However, he is never credited with the dash according to the CLT. He is always credited as Sturla Berg Johansen.
So here is the problem: In Norway, you can't have two last names. If you want two last names you must use a dash. If you don't use a dash and have three names, the second is defined as a middle name. No matter what.
So how should I parse Sturla Berg Johansen for DVD Profiler purposes? Should it be Sturla/Berg/Johansen since Norwegians aren't allowed to have two last names? Or should it be Sturla//Berg Johansen since his real last name is Berg-Johansen?
(After discussing with a few Norwegian contributors, we ended up on a 1/2/3 parsing) | | | The colour of her eyes, were the colour of insanity | | | Last edited: by reybr |
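For readers unfamiliar with the shorthand used in this thread, the two candidate parsings of a three-part credited name can be sketched as below. This is only an illustration in Python: the `ParsedName` type and the functions `parse_1_2_3` and `parse_1_23` are hypothetical names invented for this sketch, not anything from DVD Profiler itself, and the sketch assumes the name has exactly three space-separated parts.

```python
from typing import NamedTuple


class ParsedName(NamedTuple):
    """Hypothetical first/middle/last split of a credited name."""
    first: str
    middle: str
    last: str

    def slashed(self) -> str:
        # Render in the first/middle/last notation used in the thread.
        return f"{self.first}/{self.middle}/{self.last}"


def _three_parts(credited: str) -> list[str]:
    parts = credited.split()
    if len(parts) != 3:
        raise ValueError("this sketch only handles three-part names")
    return parts


def parse_1_2_3(credited: str) -> ParsedName:
    """'1/2/3': each word goes in its own field (second word = middle name)."""
    a, b, c = _three_parts(credited)
    return ParsedName(a, b, c)


def parse_1_23(credited: str) -> ParsedName:
    """'1//23': empty middle name, words two and three form the last name."""
    a, b, c = _three_parts(credited)
    return ParsedName(a, "", f"{b} {c}")


name = "Sturla Berg Johansen"
print(parse_1_2_3(name).slashed())  # Sturla/Berg/Johansen
print(parse_1_23(name).slashed())   # Sturla//Berg Johansen
# The same credit parsed two ways yields two different entries,
# which is why inconsistent parsing breaks linking.
print(parse_1_2_3(name) == parse_1_23(name))  # False
```

The last comparison shows the linking concern raised in this thread: the same credited name, parsed under the two conventions, no longer matches as a single person.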
| Registered: December 10, 2007 | Reputation: | Posts: 3,004 |
| Posted: | | | | I have no problem using 1/2/3 as a default when we don't have any evidence one way or the other, but if we do have some evidence, as in this case, we should go with where the evidence points. |
| Registered: March 13, 2007 | Posts: 21,610 |
| Posted: | | | | Quoting reybr: Quote: Quoting Graveworm:
Quote: I am really curious as to why we need a starting point at all? The best guess based on cultural norms, statistics, other credits research etc will be right far more often than wrong. Going with 1/2/3 or 1//23 is at best leaving it to luck and probably means most will be wrong.
We need a starting point for those that contribute cast/crew but have no idea what is correct. If something differs from the starting point, that can be documented through best guesses based on cultural norms, statistics, other credits research etc.
No best guesses, no worst guesses, no guesses PERIOD. That will only continue a broken linking system. ONE start point; if different, document it. PERIOD. And as I have said, 1/2/3 is more neutral than any other. Why don't we understand that this is ultimately about trying to make the linking function? Do you or do you NOT want a linking system? Skip | | | ASSUME NOTHING!!!!!! CBE, MBE, MoA and proud of it. Outta here
Billy Video |
| Registered: March 13, 2007 | Posts: 21,610 |
| Posted: | | | | Quoting reybr: Quote: On another point, I ran into a problem today with the Norwegian actor Sturla Berg-Johansen (notice the dash between the two last names). It is easily documented that his official last name is Berg-Johansen. However, he is never credited with the dash according to the CLT. He is always credited as Sturla Berg Johansen.
So here is the problem: In Norway, you can't have two last names. If you want two last names you must use a dash. If you don't use a dash and have three names, the second is defined as a middle name. No matter what.
So how should I parse Sturla Berg Johansen for DVD Profiler purposes? Should it be Sturla/Berg/Johansen since Norwegians aren't allowed to have two last names? Or should it be Sturla//Berg Johansen since his real last name is Berg-Johansen?
(After discussing with a few Norwegian contributors, we ended up on a 1/2/3 parsing)
Is this determined by force of law, reybr, or merely CUSTOM? If it's custom, then it looks like you have someone who doesn't like custom; I suggest a sound beating to bring him into the 1984 utopia of no individualism. Remember, we are not after CORRECT names relative to the CLT, so if he is most commonly credited without the hyphen then so be it. I would agree with 1/2/3 if some additional source beyond a single credit can be found that is reliable. Skip | | | ASSUME NOTHING!!!!!! CBE, MBE, MoA and proud of it. Outta here
Billy Video | | | Last edited: by Winston Smith |
| Registered: March 13, 2007 | Posts: 21,610 |
| Posted: | | | | Quoting Ace_of_Sevens: Quote: I have no problem using 1/2/3 as a default when we don't have any evidence one way or the other, but if we do have some evidence, as in this case, we should go with where the evidence points. If you have evidence, then would you not be willing to include that in notes? Don't you think that evidence should be communicated to other users as well? Skip | | | ASSUME NOTHING!!!!!! CBE, MBE, MoA and proud of it. Outta here
Billy Video |
| Registered: March 13, 2007 | Posts: 2,759 |
| Posted: | | | | Quoting T!M: Quote: Better yet: I'd even prefer automated parsing, with no ability to deviate from the standard at all. Then, we could as well go to a single name field. Doesn't make any difference to automatic parsing. |
| Registered: March 13, 2007 | Posts: 2,759 |
| Posted: | | | | Quoting Dr Pavlov: Quote: And as I have said 1/2/3 is more neutral than any other. No, it isn't. You can repeat it as long as you wish. |
| Registered: March 13, 2007 | Posts: 21,610 |
| Posted: | | | | Quoting RHo: Quote: Quoting T!M:
Quote: Better yet: I'd even prefer automated parsing, with no ability to deviate from the standard at all. Then, we could as well go to a single name field. Doesn't make any difference to automatic parsing. I agree. But more importantly, things like changes to link system or Tim's suggestion are not things we can implement, they have to be done by Ken...and then there is THE QUESTION... When. I am trying to work within the framework we have right NOW. We have a broken link system, name parsing is not the only reason for it being broken, but it is certainly a part of it since we have however many Contributors we have and a good number of them don't seem to care about fixing the link system, they simply seem to want to do whatever they wish, so we have no consistent data entry. Without consistent data entry we have no functioning link system at all. Skip | | | ASSUME NOTHING!!!!!! CBE, MBE, MoA and proud of it. Outta here
Billy Video |
| Registered: March 13, 2007 | Posts: 21,610 |
| Posted: | | | | Quoting RHo: Quote: Quoting Dr Pavlov:
Quote: And as I have said 1/2/3 is more neutral than any other. No, it isn't. You can repeat it as long as you wish. Yes, it is, Rho. You can continue to make your unsubstantiated and rather silly remarks as much as you wish, but if you don't want to engage in productive discussion you are wasting everyone's time. Now you make a claim. BACK IT UP. Skip | | | ASSUME NOTHING!!!!!! CBE, MBE, MoA and proud of it. Outta here
Billy Video |
| Registered: April 7, 2007 | Posts: 357 |
| Posted: | | | | Quoting Dr Pavlov: Quote:
And as I have said 1/2/3 is more neutral than any other.
You have said it many times. Just saying it surely can't be enough? As far as I can see, all the evidence in this thread, no matter what culture or country we take, including the US, contradicts this. Can you please supply any evidence to support it? |
| Registered: April 7, 2007 | Posts: 357 |
| Posted: | | | | Quoting Dr Pavlov: Quote:
No best guesses, no worst guesses, no guesses PERIOD. That will only continue a broken linking system. We best guess (assume) all the time. It's not a terrible thing. For example, if I were parsing your user name I'd see Dr Pavlov. My best guess is that Dr would be a salutation or title, not your first name. Likewise Sir John Gielgud. Sir John is his first name and I doubt there are any contribution notes where they had to provide evidence of his knighthood. |