Monday, February 14, 2011

Watson: Man Vs. Machine on 'Jeopardy!'

(Title linked.) Also, read this because it's cool and it deals with metadata/controlled vocabulary.

Learn to write, computer geeks!

I read a lot of posts from computer geeks. Heck, I interact with a lot of computer geeks.

I could begin posting here some discussion of RAID arrays, *nix commands and scripts, or the differences between postgresql and mysql that might illustrate a point I want to get across. (I really could; it's not that I don't know how.) It's that my reader (I hesitate to put an "s" there because I think Carol is the only person who reads me!) would be bored to tears and tell me to post something interesting.

INSTEAD, I will post about communication. (It seems to be what interests me most, so I'll use my soapbox for that.)

I was stalking some computer geeks'/writers' blogs today, or what they called blogs, and it hit me: "Day-um, this is crappy writing." I would link examples here, but I don't want to hurt anyone's feelings, because these people are blogging about programming for open source projects--I consider their endeavors valuable, useful, and important to humanity's collective knowledge. I'm a big believer in open source, so don't assume that my critique of these programmers' writing style means I dislike open source or their programming.

I began this blog a few years ago with the lofty goal of improving open source application documentation (specifically DSpace documentation, though I've since suffered a loss of faith in DSpace), on the premise that I'd be a good candidate to do so because I understand technology and I know how to string words together. Sadly, real life intervened, and I realized I had little time to commit to an endeavor like that. Instead, I continue to follow the near-incomprehensible musings of a bunch of appie-smiths who don't know the difference between a comma and a semicolon, and I grit my teeth every time I see a comma splice.

But seriously, I came across one blog this evening that I expected to be slightly more comprehensible, and all I could think was, "What drugs was this guy doing when he wrote this?" While his code seemed solid (and apparently he's a rock star of a programmer--this is not someone I know personally; I state that here because I don't want any of my computer geek friends seeing this and getting insulted by mistaken assumptions), his explanations of the projects for which he'd created the code were poor, to say the least.

So, computer geeks, learn to write! I'm a geek--granted, not hard-core, but I can follow code well enough to troubleshoot it and can dig through logs to figure out where something went wrong--and I _can_ communicate. I know a lot of geeks aren't big into the whole human-contact thing, but I have to admit, when a geek knows how to communicate, the job possibilities are endless, and the potential for job satisfaction is higher.

Geeks, go learn to write. Then in your cover letters say, "I'm a geek who knows how to write. I also use deodorant." (Okay, you don't really need that part.) You'll get a job for the writing part alone.

Minor rant. Conclude. End of file.

Friday, January 28, 2011

Usability Testing: What gets tested, what gets usabilified?

Should usability research be performed through grounded theory--where the results of a usability test determine how the product gets changed--or through traditional empirical theory (whether experimental, quasi-experimental, etc.), where the researcher begins with a specific question or hypothesis and tries to answer it through usability testing? What is tested? In usability testing, is the item being tested? Yes. Is the user's ability to test the software being tested? Absolutely not.

How, then, does the usability researcher proceed to determine A) the product of the usability test (end result) and B) where the user fits into the usability test?

The product of a usability test should always be the tested item's improved capability to fulfill its intended function. How often, though, should the intended function be adapted when users--who are, generally, very smart human beings--use the item in a fashion the creator never intended?

Usability testing relies upon de-centering the user: giving the user no anxiety about how s/he performs, no leeway to consider whether or not s/he has a place in the product testing. Let me offer a sentence: "The user working with an item provides feedback on the effectiveness of that item's design toward fulfilling an intended purpose." The subject of this sentence is "the user working with an item"--the working, NOT the user alone; similarly, working with the item is the subject of a usability test, and combining multiple sessions of working with the item provides results from which conclusions may be drawn (grounded) or which answer the questions asked.

I am fairly new to being on the testing side of usability testing; I've performed multiple usability tests as a user for various friends, and I've often discussed iterative design and usability in my work, simply because these are incredibly important issues in digital curation. But when I move to the testing side of things, I have to de-center myself, to recognize that I am not performing usability testing for any purpose beyond what the software is doing; the user and I (as the tester) are irrelevant and replaceable. What is important are the results of working with the item, and how I, as a usability researcher, structure those results back toward item redesign. There is not a third product that emerges from this research--for instance, improved software training, as a friend and I were discussing; there is no room for a third product, because usability testing is about the action performed with the item, and how combined actions contribute to an improved item.

Sunday, January 23, 2011

Learning through observation

I am enrolled in a usability research course this semester, and my first assignment for class was to conduct a site visit and observe someone doing his/her job. This goes directly to usability research because, to perform usability testing, one must learn to observe users, must learn to pay attention to what users do--only to take in, not to assign value, criticize, or pass judgment. Just to observe.

I greatly enjoyed the site visit, and I will post my write-up here if I get permission from the person I visited. I visited one of my former colleagues to observe her as she created catalog records for materials; when we worked together, I never had time to do this--I was always simply a beneficiary of her labor, and I had no need to pay attention to how the metadata in my system originated with this person. Now I have learned the process that goes into creating a record--both a record for our own university's search engine and a record with OCLC that can be localized and adapted to other libraries' holdings. (I have lots of librarian friends who are welcome to post on the actual process that I watched; I'm just stating what I observed, in my own words. Please feel free to say, "You were watching this person do XYZ.")

What hit home even more for me was the goal of user observation. Understanding what people are doing and recognizing how they make things happen--in whatever job they do, it matters not; it could be making smoothies or making metadata--requires developing a habit of shutting off one's critical brain, that part of the brain that says, "This user did this badly, and should have done this instead, and that's how I would fix it because I know better." Getting into the habit of watching what people do and respecting their work as people who use tools teaches us how to perform usability research. Usability research is not at all about, "Here's what would fix this product." No. It's about, "Here is how this user uses the product. Here is how the product was designed," and then letting whoever hired the usability researcher know about user habits and design intents; it's not about making judgments and criticisms.

Iterative design--the constant re-evaluation of how users use a product in order to redesign that product--requires constant observation, constant taking in, constant understanding of what people are doing with a product. Iterative design in software and websites is becoming increasingly important, particularly for library/archive websites, which are constantly gaining new information, new data, and which aren't necessarily being redesigned to accommodate this new data. For instance, at ALA in San Diego this year--which I did not attend, but merely heard about--OCLC announced a new tool that harvests data from digital archives and makes their materials available through OCLC, which basically means that OCLC now recognizes digital materials (that have proper metadata and OAI-PMH harvest capability) on a par with physical holdings at local libraries. This is huge--and now it is incumbent upon local libraries to design their sites to be even more usable, and to constantly revisit their designs--to iteratively design based on how users use their information--so that they can support the high traffic that will begin to appear.
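For the curious, "OAI-PMH harvest capability" boils down to a repository answering plain HTTP requests with XML. Here is a minimal sketch, in Python, of pulling Dublin Core records from a repository's OAI-PMH endpoint; the endpoint URL is made up for illustration, and the sketch assumes only the standard ListRecords verb and the oai_dc metadata prefix (no resumption-token paging).

    # A minimal sketch of an OAI-PMH harvest (the endpoint URL below is hypothetical).
    # Assumes the standard ListRecords verb and oai_dc metadata prefix; no paging.
    import urllib.parse
    import urllib.request
    import xml.etree.ElementTree as ET

    OAI = "{http://www.openarchives.org/OAI/2.0/}"
    DC = "{http://purl.org/dc/elements/1.1/}"

    base_url = "https://archive.example.edu/oai"  # hypothetical repository endpoint
    query = urllib.parse.urlencode({"verb": "ListRecords", "metadataPrefix": "oai_dc"})

    with urllib.request.urlopen(base_url + "?" + query) as response:
        tree = ET.parse(response)

    # Print a title and identifier for each harvested record.
    for record in tree.iter(OAI + "record"):
        title = record.find(".//" + DC + "title")
        identifier = record.find(".//" + DC + "identifier")
        print(title.text if title is not None else "(no title)",
              "|",
              identifier.text if identifier is not None else "(no identifier)")

That's the gist of it, anyway; whatever OCLC's tool does under the hood, it depends on repositories exposing clean metadata through an interface like this one, which is exactly why quality metadata matters so much.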

It's an exciting time. Questions of quality metadata come heavily into play, and quality metadata will rely upon strong observation of how people search. Not observation with the intent of, "I can fix this whole situation because it's not currently usable," but observation with the intent of, "Here is how users are using this right now. This is what I'm learning and taking in."

Thursday, October 14, 2010

Geographic Representation of Cultural Identity in Digital Projects and Digital Preservation

Digital projects and preservation require a variety of high-level skill sets, a lot of continuing education, and a lot of expensive equipment. If an archive's job is to preserve and provide access to content donated by the public, to maintain that content because it is a snapshot in time of a particular cultural identity, and to provide access to as wide an audience as possible, digitization of an archive's contents makes good sense. An isolated archive that serves as a regional hub can hold artifacts from towns and individuals for miles around; unfortunately, it does not necessarily also hold the skill set, the budget, or the recognition of a need for education.

In cases where an archive is isolated, there are a few possibilities that can prevent under-representation of an isolated region's cultural identity:

(I don't include the care-taking approach here because that is the best way for an archive to become under-represented in the digital age.)

-Collaboration: Pooling all resources is probably the smartest way to prevent under-representation of a cultural identity. Collaborations between larger and smaller institutions are cheap, fast, and easy ways to get funding, talent, and publicity.

-New skills training: An organization can actively pursue good, inexpensive training from various sources; refreshing existing skills with new technological knowledge fosters the recognition that a wider world--and a future within it--exists, and that one's collection needs to be visible in that wider world.

-Intra-institution skill harvesting: I just attended a digital preservation workshop, and Nancy McGovern, one of the educators, suggested that institutions seek out skills by posting their skill-set needs on an intranet; this could lead to surprising discoveries.

-Speak to the public: If the public built the archive through donations, educating the local populace about the need to publicize the collection could result in local contributions toward the facility's needs.

These are just a few potential solutions that I'm kicking around, but as I look at them, I see that they all involve some form of collaboration or other. In this economy, denying the need for collaboration borders on stupidity at the individual level, and on gross negligence when one considers that the eye-on-the-prize goal is representing and preserving a culture's identity to the greatest extent possible. Archives in isolated regions might not realize how isolated they are until they try to reach out and find they lack the ability to do so.

I often hear people complain about how the East and West Coasts and big cities are the only places where television shows are set; I think it's a scary prospect to look 20 years into the future and imagine what will be represented (based on current, good digital preservation planning)--and the holes left by cultures that went under-represented. If isolated institutions isolate themselves further through adamant refusal to collaborate or to assign appropriate resources and skills now, history teachers in 20 years might have trouble showing electronic versions of hand-drawn maps of those regions from 150 years prior.

Sunday, October 10, 2010

Keeping Up with Your Homework, and Why It's Important

Although I have worked in technology for 14 years--from high school, through college, through grad school, through full-time employment--my educational background, as well as most of my interests, has centered around reading. All my degrees are in one breed of English or another--creative writing, medieval British literature, and now technical communication/rhetoric. Like a duck to water, then, when I need to learn about something in digitization and technology, I read about it. If I need to learn about digital repository trustworthiness, I dig it out of Google--and read it. If I need to know about World of Warcraft: Cataclysm and when it comes out--I read about it.

Either way, in my opinion, the world is at one's fingertips in the age of the Internet. We can find information about anything, and we can read about it. I consider it the height of irresponsibility--gross negligence, really--if someone trained in a discipline and working in that discipline does not also pursue as much information from the Web as s/he can. I was speaking with a librarian friend the other day about the Trusted Digital Repositories report and checklist, because I'm preparing to attend a workshop for which these two things were requested reading. Now, these are things that I've encountered a few times before--first because I'd read about them on my own, and then in discussions with other librarians.

I have been repeatedly stunned, however, by the surprise on librarians' faces when I say, "This information is online." I will often receive links from them to reports from the early 2000s, sent to me as if they were new information, because they were new to them.

I expressed this to my librarian friend in our conversation about trusted digital repositories--he is also involved with digital initiatives--and he said that it's very common for librarians not to have read up on what they're practicing. I'm shocked by this. While I recognize that my background in various fields of English studies probably gives me an advantage in reading adaptability and critical thinking over, say, someone in biomechanics, I have a lot of trouble understanding why a librarian would not pursue as much literature as is available to learn more about what is a fascinating, evolving field.

So, if you're a librarian and you don't read--get off your hiney and get to work. Start reading. What you learned in your MLS studies is going to expire--no matter what, it WILL expire--and you'd better be equipped with all the new information you can find, even if it's free, Googled info.

(Thank you, Google, for your Palpatinian influence on the organization and structure of information.)

Tuesday, October 5, 2010

Wikis, Blogs, Articles, Peer-Review Processes

When I was in a meeting today, someone asked for proof that another person is an expert in his field; he requested that this proof appear in the form of a blog post, a wiki, or some other sort of non-peer-reviewed format.

This got me thinking about the credence that academia has come to attach to Web 2.0 technologies. Another friend of mine recently published an article in Code4Lib (Jason Thomale; Google his name--he's brilliant); this created a stir in the metadata, cataloging, and coding realms, much of which played out in Web 2.0 venues, and all of it has resulted in professional acclaim for him.

Having grown up with the academic attitude of publish or perish, where peer-reviewed articles are the only valid publications by which tenure may be judged, I am growing to enjoy this progressive trend.

There are obvious questions.
How do we trust information that is randomly posted online, with no vetting process, and can unvetted information be cited with the same weight and merit as peer-reviewed publications?

I am reminded of the Wikipedia incident in which a man representing himself as a PhD had posted assorted history or art information (I can't remember which); the public was stunned when it came out that this man was simply an amusing amateur. Was his information any less valid because he had misrepresented his identity? If I remember correctly--and shame on me for not looking this up prior to blogging about it--his information was indeed accurate and contained few errors.

Identity and reputation online are built, and then the real world intrudes. If the real world doesn't jibe with the online identity one has created, one can find oneself cyber-culturally snubbed. I've discussed online identity manipulation before, but what about online reputation building? Reputation gives credence to what one publishes online; since reputation is the most important currency in the online realm of cyberculture and Web 2.0, the aura of a good reputation can lend one's words more weight than it can in peer-reviewed publishing circles.

Audience, reputation, and online publishing.