Hierarchical cluster analysis

Published on: 19/02/2012

Went through the hierarchical cluster analysis of the existing data matrix again with Warnar. What we did was the following:

  1. Select e.g. all the ‘aims’.
  2. Tick the dendrogram option.
  3. Let the procedure do its work.
  4. The result should be a horizontal dendrogram with the case numbers on the left.
  5. Clusters that form below distance 10 are worth interpreting and discussing.
  6. If I make a dendrogram of each dimension I could compare them.
  7. Another way would be to recode the cases with the clusters they form (at a certain distance) and perform another hierarchical cluster analysis across all the dimensions.
  8. The goal of hierarchical cluster analysis is to arrive at, or generate, hypotheses that can be tested against new data.
  9. One hypothesis could be that the cases vary so much across the dimensions that no patterns are found within the dataset.
    1. I quickly tested that with one strong cluster with respect to the ‘aim’ (the cluster of cases 45, 53, 19, 41, 44).
    2. In the clusters of ‘involved participating parties’ these cases were all divided over the different clusters below distance 10.
    3. I have to compare this with the comparisons I made between these two dimensions based on the two-step clustering (e.g. table 5).
  10. I tried to cluster the two dimensions together on the raw data, but that is no longer informative:
    1. The result becomes messy: most clusters are formed above distance 10.
    2. Some small, weak clusters form, but it is hard to say, based on the data matrix, what causes them.
  11. So it is better to recode the cases by the clusters they are in with respect to one dimension and then run the test again on more dimensions. Then I get close to what I did with the two-step clustering.
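
The recoding idea in points 7 and 11 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the SPSS procedure itself: the distance (a count of mismatching codes), the average-linkage rule, the cut-off values and the six invented cases are all assumptions, and the thresholds are not comparable to the "distance 10" on the SPSS dendrogram axis.

```python
from itertools import combinations


def mismatch_distance(a, b):
    """Count the positions on which two coded cases differ."""
    return sum(x != y for x, y in zip(a, b))


def average_linkage(c1, c2, data):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(mismatch_distance(data[i], data[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def cluster_cases(data, max_distance):
    """Agglomerative clustering that stops merging once the two closest
    clusters are farther apart than max_distance (the dendrogram 'cut').
    Returns a dict mapping each case id to a cluster label."""
    clusters = [[case_id] for case_id in data]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], data),
        )
        if average_linkage(clusters[i], clusters[j], data) > max_distance:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return {
        case_id: label
        for label, members in enumerate(clusters, start=1)
        for case_id in members
    }


# Invented 0/1 codes for six hypothetical cases on two dimensions.
aims = {1: (1, 1, 0, 0), 2: (1, 1, 0, 0), 3: (1, 1, 1, 0),
        4: (0, 0, 1, 1), 5: (0, 0, 1, 1), 6: (0, 1, 1, 1)}
parties = {1: (1, 0, 0), 2: (0, 1, 1), 3: (1, 0, 0),
           4: (0, 1, 1), 5: (1, 0, 0), 6: (0, 1, 1)}

aims_clusters = cluster_cases(aims, max_distance=1)
party_clusters = cluster_cases(parties, max_distance=1)

# Recode every case as its cluster membership per dimension, then cluster again.
recoded = {case: (aims_clusters[case], party_clusters[case]) for case in aims}
combined = cluster_cases(recoded, max_distance=0)
```

With a cut-off of 0 in the second run, only cases that fall in the same cluster on every dimension end up together, which is one simple way to see whether a strong cluster on ‘aims’ survives when a second dimension is added.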

What I have to do is:

  1. Re-read the comments on my draft article and then decide on these tests again.
  2. Re-read my article and find weak spots and hunches.
  3. Run the tests as mentioned above, also try the other output formats (new variables, other formats of output).
  4. Make a little report on that for discussion with Liesbet.

Hierarchical Cluster Analysis continued

Published on: 29/11/2011
  1. Possibility to combine this with multidimensional scaling. In Dutch: ‘meerdimensionale schaal analyse’.
  2. Look for articles on dichotomous data matrices combined with hierarchical cluster analysis, or on loglinear models for dichotomous data matrices.
  3. Name that Frans Hubbard gave me: Pieter Kronenberg.

Protected: Notes on different manipulations and tests on the data matrix

Published on: 30/09/2011

Possible crosstabs

Categories: 4 Research methods
Published on: 21/09/2011

Since I cannot make every crosstab on my data (50+ variables), and doing so would not be very useful anyway, I will have to make smart combinations. Here are some leads:

  1. Investments in affordances but low engagement -> index affordances with the number of contributions/comments.
  2. Also check other hypotheses.
  3. I ran a "crosstab everything" kind of command and went through the results to look for peculiar values (Cramér's V).
  4. What do we do with the fact that the crosstabs are not ‘pure’? For example, the involved startup participant ‘individual citizen’ crossed with ‘support by donations’ can be accompanied on both sides by other participants or supporters.
  5. I do have asymmetric relations in my data, so should I have used Goodman and Kruskal's tau?
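
To make the difference between the two measures in points 3 and 5 concrete, here is a minimal sketch in plain Python. The table of counts is invented for illustration, the function names are my own, and the code assumes a table without empty rows or columns.

```python
from math import sqrt


def cramers_v(table):
    """Cramér's V for a crosstab given as a list of rows of counts.
    Symmetric: transposing the table gives the same value."""
    n = sum(map(sum, table))
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k))


def goodman_kruskal_tau(table):
    """Goodman and Kruskal's tau for predicting the COLUMN variable
    from the ROW variable. Asymmetric: transpose to swap the roles."""
    n = sum(map(sum, table))
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    within_rows = sum(
        observed ** 2 / row_totals[i]
        for i, row in enumerate(table)
        for observed in row
    )
    marginal = sum(total ** 2 for total in col_totals) / n
    return (within_rows - marginal) / (n - marginal)


# Invented counts: rows are three categories of one variable,
# columns are the two categories (0/1) of another.
table = [[10, 0],
         [10, 0],
         [0, 20]]
transposed = [list(col) for col in zip(*table)]

v = cramers_v(table)
tau_cols_from_rows = goodman_kruskal_tau(table)
tau_rows_from_cols = goodman_kruskal_tau(transposed)
```

In this invented table the row variable predicts the column variable perfectly, but not the other way round. Cramér's V cannot show that, because it is the same in both directions, while tau differs depending on which variable is treated as the predictor.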

Coding 14 new cases

Categories: 4 Research methods
Published on: 20/09/2011

Here I report on my experiences with coding 14 new cases. Earlier I had coded about 70 cases, including some that were not exemplary. I threw 25 out, which brought me back to 45 cases, and then selected 14 from my list of reserves (21). I think it is important to document here what it is like to code the new ones after all the experiences up to now. I now use this list:

1. Involved, 2. Aims, 3. Methods, 4. Stories, 5. Affordances, 6. Evolution

  1. It seems to me now that some citizen initiatives do not have aims as strong as “sense of place”, or have none at all (oranjeboompleinbuurt). So I added a code ‘missing’ to ‘aims’.
  2. I do not have a code for advanced social functions (like becoming friends or forming a group). Flickr has such functions. I added ‘making connections’, but I didn’t go through all the data to check for occurrences. (Memory of East has this too.)
  3. I don’t have the affordance ‘map’ yet, nor ‘adding your own tags’.
  4. Some cases are part of an overall social project or an overall neighborhood website.
  5. Some cases set an informal tone, but get formal historical stories (Columbus neighborhood stories).
  6. I have at least two regional websites in my cases.
  7. I believe by now that countries/regions have built up local traditions with respect to online collections of local stories, more or less inspiring each other with storytelling projects.
  8. One of the cases (westpark) mentioned sharing as one of the purposes. So does the buurtwinkelproject.
  9. I am still surprised at how much time I lose understanding each case in order to be able to code it. Maybe I also get a little lost in the local stories.
  10. With startup I don’t make a distinction between company and individual local social entrepreneurs, although the latter is present about 7 times I think.
  11. In case of elementary schools I coded the youth as ‘citizen’, not as ‘professional’ (which I did with students who were learning how to be a reporter etc).
  12. The difference between formal/informal and everyday/ historical becomes clearer again:
    1. Informal/formal is how the text on the website talks about, or makes implications about, the style of the stories (e.g. “fascinating memories” is closer to informal than to formal).
    2. Everyday/historical is sometimes literally in the texts, but can also be paraphrased (e.g. “a flavour of life as it was then” is closer to everyday than to historical).
  13. For example the aims that Buurtwinkels had with their neighborhoodshop-project are not online, but in other documents. In other words: not all the aims will be made explicit on the websites of the cases I have found.
  14. The categories ‘guestbook’ and ‘who knows’ are often used for the same purpose. Or better: the comment function seems to entail these two.
  15. In one case other memory websites and their communities are participating partners. I have put them in ‘other’.
  16. Should I position my field analysis as a pilot study?
  17. I don’t do much with the age of the site, but some of them have been around since 2003 and are still active.
  18. With respect to the contributions: I think I should look at 2010 and 2011 to make a claim about the activity.
  19. News items are counted only when they are not automatically imported from elsewhere (e.g. Floresta).
  20. Note the discrepancies between what’s on the website and what is going on behind the scenes. For example buurtwinkels: “Deze site zal bewaard worden door het Amsterdam Museum, ook nadat de tentoonstellingen afgelopen zijn. Uw verhalen gaan dus niet verloren en zullen hopelijk ook in de toekomst door veel mensen gelezen worden.” (In English: “This site will be preserved by the Amsterdam Museum, also after the exhibitions have ended. So your stories will not be lost and will hopefully be read by many people in the future as well.”) I know from the museum that this is probably not going to happen.
  21. I am a bit worried about ‘participation’. In my context the word means involving citizens, but in all cases citizens are meant to be involved. Only a few cases mention letting certain isolated groups participate. Those are the ones I coded.

Protected: Notes on aims

Published on: 24/08/2011

To do week August 18-25

Categories: 4 Research methods
Published on: 18/08/2011

The data matrix is a reduction of the available data coded in MaxQDA; e.g. the instances of the categories are not always informative for our questions.

  1. Develop new data matrix.
  2. Adjust code-tree in MaxQDA to this matrix. (number of visitors).
  3. Check and – if necessary – recode each case.
  4. Export to raw data matrix.
  5. Manipulate the raw matrix to the workable one (point 1).
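
Steps 4 and 5 could look roughly like the sketch below, assuming that the raw export is a table of code frequencies per case and that the workable matrix only records whether a code occurs at all (0/1). The column names, the CSV layout and the counts are invented for illustration and do not describe the actual MaxQDA export.

```python
import csv
import io

# Hypothetical raw export: one row per case, one column per code,
# cells holding the number of coded segments.
RAW_EXPORT = """case,aims_sense_of_place,aims_sharing,affordance_comments
1,3,0,12
2,0,1,0
3,1,0,4
"""


def to_dichotomous(raw_csv, id_column="case"):
    """Reduce code frequencies to presence (1) / absence (0) per case."""
    matrix = {}
    for row in csv.DictReader(io.StringIO(raw_csv)):
        case_id = row.pop(id_column)
        matrix[case_id] = {code: int(int(count) > 0) for code, count in row.items()}
    return matrix


workable = to_dichotomous(RAW_EXPORT)
```

The reduction throws away how often a code was applied, which fits the remark above that the number of instances is not always informative.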

Meeting with Warnar August 3rd

Categories: 4 Research methods
Published on: 18/08/2011

I met with Warnar just before my holiday. We had a good talk about the statistical analyses that I might want to apply, but also about the design of my multiple case study. Here are the most important notes I made:

  1. I explained to Warnar how I went through an unsuccessful cluster analysis and how I applied a kind of replication logic to descriptions of a few cases (see here for the extended explanation).
  2. Next I explained how ‘engagement’ comes to the foreground as a central concept in both the aims (social design) and the results (resulting behavior) of the cases in terms of which things can be measured or at least compared.
  3. Then we arrived at the following design:
    1. On the organizational (social design) side of the cases the characteristics (variables) are intentional. This data can form clusters of characteristics (reoccurring combinations in otherwise different cases) and of cases (clustered based on similarity in characteristics). There seems to be a relation between the two but I don’t know what that is.
    2. On the results side (resulting behavior) of the characteristics of the cases there is a measure of engagement which can yield a group of high scoring cases, but also subgroups within this group that are built up similarly (e.g. high level of comments, low contributions and low visitors vs low level of comments, low contributions and high number of visitors).
    3. 1 and 2 make it possible to:
      1. For example find design characteristics that co-occur.
      2. Attempt to explain where this comes from.
      3. For example identify design characteristics that yield a high level of engagement.
      4. For example find cases that are similar on the design side, but differ on the results side.
      5. Attempt to explain these findings.
  4. The tests that we discussed were:
    1. Bivariate tests: cross-tabulation.
    2. Multivariate tests:
      1. Cochran's Q: comparable with a repeated-measures ANOVA, but for dichotomous 0/1 scores,
      2. Cronbach's alpha for the internal consistency (shared variance) of the characteristics,
      3. Multiple correspondence analysis (nominal/ categorical measures),
      4. I have made a note that cluster analysis is also an option, but I have to re-check that.
  5. Tip: Siegel – Nonparametric statistics for the behavioral sciences. If I am right then this is connected to descriptive statistics as opposed to inferential statistics.
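
As a reminder of what the first two multivariate tests compute, here is a minimal sketch in plain Python using the textbook formulas. The six cases and three dichotomous characteristics are invented, and the function names are my own.

```python
from statistics import pvariance


def cochrans_q(rows):
    """Cochran's Q for a cases-by-variables matrix of 0/1 scores.
    Returns (Q, degrees of freedom); Q is compared with a chi-square
    distribution with k - 1 degrees of freedom. Undefined when every
    case scores all 0s or all 1s (the denominator is then zero)."""
    k = len(rows[0])
    col_totals = [sum(col) for col in zip(*rows)]
    row_totals = [sum(row) for row in rows]
    n = sum(row_totals)
    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - n * n)
    denominator = k * n - sum(r * r for r in row_totals)
    return numerator / denominator, k - 1


def cronbachs_alpha(rows):
    """Cronbach's alpha: internal consistency of the k variables."""
    k = len(rows[0])
    item_variances = [pvariance(col) for col in zip(*rows)]
    total_variance = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented data: six hypothetical cases scored 0/1 on three characteristics.
scores = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
    [1, 1, 0],
]

q, df = cochrans_q(scores)
alpha = cronbachs_alpha(scores)
```

Cochran's Q asks whether the characteristics occur equally often across the same set of cases, while Cronbach's alpha asks whether the characteristics tend to go together within cases, so the two address different questions about the same matrix.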

Protected: Article version July 28th

Published on: 01/08/2011

Playing with SPSS cluster analysis

Published on: 22/07/2011

Checked the “Direct Marketing Tool” cluster analysis within SPSS with some dummy data (see picture below). Some notes:

  1. Obviously there is a connection calculated between all the features within the clusters and based on the connections over all the clusters: the 0.97’s below come from the fact that there was one case (record) having the values 0-0-0-0-0, which ended up in cluster 2, because all the other cases there had 0-0 on the last two features. So this one was exceptional having 0-0 on the first two features.
  2. I have to look at the algorithm which generates the clusters: when does a set of ‘exceptions’ get ‘strong’ enough ‘together’ to become a cluster themselves?
  3. It is possible to make selections from the cases from the cluster model viewer and use those cases as input for new cluster analysis.
  4. Check this http://en.wikipedia.org/wiki/Cluster_analysis as a starter.
  5. See also Bryman p. 62 pointers to Research in focus.
  6. http://edndoc.esri.com/arcobjects/8.3/Samples/Analysis%20and%20Visualization/Cluster%20Analysis/CLUSTERANALYSIS.htm
