Philosophy of Science Portal: Pigeon holing behaviorists?

Thursday, December 10, 2009

Pigeon holing behaviorists?

I am not sure what to make of this. Its relevance seems negligible... just how far can one apply statistical data to human development?

"Rare words 'author's fingerprint'"

December 19th, 2009

BBC NEWS

Analyses of classic authors' works provide a way to "linguistically fingerprint" them, researchers say.

The relationship between the number of words an author uses only once and the length of a work forms an identifier for them, they argue.

Analyses of works by Herman Melville, Thomas Hardy, and DH Lawrence showed these "unique word" charts are specific to each author.

The work is published in the New Journal of Physics.

Researchers also suggest each author pulls their works from a hypothetical "meta book". One description of this concept might be a framework for the way an author uses language. It is from this framework that all their works are ultimately derived.

In 1935, the Harvard University linguist George Kingsley Zipf demonstrated a mathematical relationship between the frequency of a word in a text and its rank in the list of an author's most used words.

So, the second most frequent word in a book occurs half as often as the first, the third most frequent occurs one-third as often, and so on.
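The rank rule described above can be checked against any text. What follows is only a minimal sketch: the tokenizer is a crude assumption (lowercase letters and apostrophes), and the names `rank_frequencies` and `zipf_prediction` are made up for illustration rather than taken from Zipf or from the study.

```python
import re
from collections import Counter


def rank_frequencies(text):
    """Return word counts ordered from the most to the least frequent word."""
    # Assumed tokenizer: runs of lowercase letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return [count for _, count in Counter(words).most_common()]


def zipf_prediction(top_count, rank):
    """Count the idealized rule gives for a rank: the top count divided by the rank."""
    return top_count / rank


if __name__ == "__main__":
    sample = "the cat and the dog and the bird"
    counts = rank_frequencies(sample)
    # Print rank, observed count, and the count the rule would predict.
    for rank, observed in enumerate(counts, start=1):
        print(rank, observed, round(zipf_prediction(counts[0], rank), 2))
```

A short sample will not follow the rule closely; the pattern is only expected to show up in long texts.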

The rule laid the groundwork for many mathematical analyses of words, in which the Zipf law seemed to be a universal property of English - and by extension, of language itself.

Building on that idea, researchers at Umeå University in Sweden have found that language use isn't as universal as Zipf's law might suggest.

They have used a related approach that comes up with a unique identifier for each author.

Hardy measure

Clearly, a longer written work has more unique words - words that appear just once in the text.

However, even the best writer's vocabulary will at some point run out of words that have not yet been used.

The researchers gathered together the complete works of Hardy, Melville, and Lawrence, and measured that dependence - counting the number of new unique words as a particular author's works get longer and longer.
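The measurement can be sketched in a few lines, taking the article's description at face value: a "unique" word is one that has appeared exactly once in the text read so far. The function name `once_only_curve` and the `step` parameter are assumptions for illustration; the researchers' own procedure may differ.

```python
from collections import Counter


def once_only_curve(words, step):
    """Record (words read, words seen exactly once so far) every `step` words."""
    counts = Counter()
    once = 0  # running number of words whose count is exactly 1
    curve = []
    for position, word in enumerate(words, start=1):
        counts[word] += 1
        if counts[word] == 1:
            once += 1  # first sighting: the word is unique for now
        elif counts[word] == 2:
            once -= 1  # second sighting: the word is no longer unique
        if position % step == 0:
            curve.append((position, once))
    return curve


if __name__ == "__main__":
    text = "a b c a a b"
    print(once_only_curve(text.split(), step=3))
```

Keeping a running total avoids recounting the whole text at every step, so the curve for a long book takes a single pass.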

They used sections from books of varying lengths, randomly pulled from novels, alongside shorter works and short stories.

They found that the authors had distinctly different "unique word" curves.

The team suggests that a work by an unknown author could therefore be compared to prior works, with the curve acting as a linguistic "fingerprint".
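The article does not say how the curves would be compared. One simple possibility, offered here purely as an assumption, is to take the average gap between two curves at the text lengths they share and pick the known author with the smallest gap. The function names and the numbers in the usage example are invented for illustration and are not results from the study.

```python
def curve_distance(curve_a, curve_b):
    """Mean absolute gap between two curves at the text lengths they share."""
    b_by_length = dict(curve_b)
    gaps = [
        abs(count - b_by_length[length])
        for length, count in curve_a
        if length in b_by_length
    ]
    if not gaps:
        raise ValueError("curves share no text lengths")
    return sum(gaps) / len(gaps)


def closest_author(unknown_curve, known_curves):
    """Name of the known curve that lies nearest to the unknown one."""
    return min(
        known_curves,
        key=lambda name: curve_distance(unknown_curve, known_curves[name]),
    )


if __name__ == "__main__":
    # Made-up curves: (text length, once-only words) pairs, not real measurements.
    known = {
        "author_a": [(100, 60), (200, 100)],
        "author_b": [(100, 40), (200, 70)],
    }
    print(closest_author([(100, 42), (200, 75)], known))
```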

Source material

The meta book concept proposed by the researchers is not simply the list of all the words an author knows, but also the "distribution" of those words as the author produces them, whether in drafting an e-mail or writing War and Peace.
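One way to probe the meta book idea is to pull equal-sized sections out of texts of different lengths and see whether they contain similar numbers of once-only words. This is a rough sketch under stated assumptions: sections are contiguous runs of words chosen at a random position, and the names `random_section` and `once_only_count` are hypothetical. The stand-in text in the usage example is synthetic, not an author's work.

```python
import random
from collections import Counter


def random_section(words, size, rng):
    """Pull one contiguous run of `size` words starting at a random position."""
    if size > len(words):
        raise ValueError("section is longer than the text")
    start = rng.randrange(len(words) - size + 1)
    return words[start:start + size]


def once_only_count(words):
    """Number of distinct words that occur exactly once in the list."""
    return sum(1 for count in Counter(words).values() if count == 1)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs pick the same sections
    # Synthetic stand-in texts; a real test would load one author's books.
    short_book = [f"w{i % 37}" for i in range(1000)]
    long_book = [f"w{i % 37}" for i in range(2000)]
    for book in (short_book, long_book):
        section = random_section(book, 30, rng)
        print(len(book), once_only_count(section))
```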

"It doesn't matter if I pull out 10,000 words from a book of 100,000 or from a book of 200,000, I get the same behaviour; you always simply pull a piece out of your very, very big 'meta book', which is just a representation of your style," said Sebastian Bernhardsson, who led the work.

"That story you're writing right now is a piece of that big book and that is what you're pulling out," he told BBC News.

The team will continue the analyses with different English authors, and with authors in different languages. As their collection of fingerprints grows, Mr Bernhardsson said, they will try to identify the authors of anonymous works.

But not every result is a happy one, he added.

"It's a fun and interesting exercise, but I've plotted my own thesis in this sense and it was kind of discouraging comparing to some more famous authors."

Zipf's law

Poet colleague

Annus mirabilis-1905

March is a time of transition
winter and spring commence their struggle
between moments of ice and mud
a robin appears heralding the inevitable
life stumbling from its slumber
it was in such a period of change in 1905
that the House of Physics
would see its Newtonian axioms
of an ordered universe collapse
into a new frontier
where the divisions of time and space
matter and energy
were to blend as rain and wind
in a storm that broke loose
within the mind of Albert Einstein
where Brownian motion danced
seen and unseen, a random walk
that became his papers marching through science
reshaping the very fabric of the universe
we have come to know
we all share a common ancestor
a star long lost in the eons of memory
and yet in that commonality
nature demands a permutation
a perchance genetic roll of the dice
which births a new vision
lifting us temporarily from the mystery
exposing some of the roots to our existence
only to raise a plethora of more questions
as did the papers of Einstein in 1905

TIMRAY
SAN DIEGO 9/05
