European Parliament groups and how they vote

Based on a set of 3299 votes obtained through VoteWatch.eu, we create a dataset for analysis using Pandas. After some initial data visualisation (heatmaps), we compute a Euclidean distance matrix between all pairs of political groups. A cluster map is then created from the distance matrix using Ward clustering, presenting the way the different groups cluster together in a dendrogram that shows the relative distance between all of them.

Using DBSCAN and Spectral Clustering on the computed affinity matrix, we obtain the clusters that each method identifies; using multidimensional scaling (MDS), we then create 2D and 3D maps of the relative distances between all groups, both overall and by Policy Area.

Importing data

This data was extracted from http://votewatch.eu, a site I recommend to anyone who wants to follow political activity in the EU; it consists of ~3300 votes from the last quarter of 2020.

eu_v.head()
ID Date Policy Area Name For Against Abstentions Result GUE-NGL S&D Greens/EFA REG EPP ECR IDG NI
0 2324 9/14/2020 Transport & tourism Sustainable rail market in view of COVID-19 ou... 64 0 0 Adopted For For For For For For For For
1 2325 9/14/2020 Budget Draft amending budget no 8: Increase of paymen... 62 0 2 Adopted For For For For For For For For
2 2326 9/14/2020 Regional development Proposal for a Council decision authorising Po... 64 0 1 Adopted For For For For For For For For
3 2327 9/14/2020 Culture & education Effective measures to “green” Erasmus+, Creati... 65 0 0 Adopted For For For For For Against Abstain For
4 2328 9/14/2020 Environment & public health The EU’s role in protecting and restoring the ... 65 0 0 Adopted For For For For For For For For

Each row is a vote, and the columns include each political group's position (as defined by VoteWatch.eu), the result and the policy area, amongst others.

eu_v.columns
Index(['ID', 'Date', 'Policy Area', 'Name', 'For', 'Against', 'Abstentions',
       'Result', 'GUE-NGL', 'S&D', 'Greens/EFA', 'REG', 'EPP', 'ECR', 'IDG',
       'NI'],
      dtype='object')

Looking at the data

Information on the political groups can be obtained directly from the European Parliament site (https://www.europarl.europa.eu/about-parliament/en/organisation-and-rules/organisation/political-groups); a very brief description based on the above information and direct quotes (when possible) from their official sites:

  • Group of the European People’s Party (Christian Democrats): “The EPP Group is the largest and oldest group in the European Parliament. A centre-right group, we are committed to creating a stronger and self-assured Europe, built at the service of its people. Our goal is to create a more competitive and democratic Europe, where people can build the life they want.”

  • Group of the Progressive Alliance of Socialists and Democrats: “The S&D Group is the leading centre-left political group in the European Parliament and the second largest. Our MEPs are committed to fighting for social justice, jobs and growth, consumer rights, sustainable development, financial market reform and human rights to create a stronger and more democratic Europe and a better future for everyone.”

  • Renew Europe Group: “There has never been a larger centrist group in the European Parliament. By ending the dominance of the Conservatives and the Socialists, Europeans have given us a strong mandate to change Europe for the better. At a time when the rule of law and democracy are under threat in parts of Europe, our Group will stand up for the people who suffer from the illiberal and nationalistic tendencies that we see returning in too many countries.”

  • Group of the Greens/European Free Alliance: “The Greens/European Free Alliance is a political group in the European Parliament made up of Green, Pirate and Independent MEPs as well as MEPs from parties representing stateless nations and disadvantaged minorities. The Greens/EFA project is to build a society respectful of fundamental human rights and environmental justice: the rights to self-determination, to shelter, to good health, to education, to culture, and to a high quality of life”

  • Identity and Democracy Group: “Identity and Democracy (ID) is a new group, which is the fourth largest one in the current European Parliament”; “The Members of the ID Group base their political project on the upholding of freedom, sovereignty, subsidiarity and the identity of the European peoples and nations. They acknowledge the Greek-Roman and Christian heritage as the pillars of European civilisation.”

  • European Conservatives and Reformists Group: “The ECR Group is a centre-right political group in the European Parliament, founded in 2009 with a common cause to reform the EU based on euro-realism, respecting the sovereignty of nations, and focusing on economic recovery, growth and competitiveness. From its 8 founding Member States with 54 MEPs in 2009, we now have 62 members from 15 EU Member States. The ECR Group is at the forefront of generating forward-looking policy proposals to design a reformed European Union that is more flexible, decentralised and respects the wishes of its Member States. Only an EU that truly listens to its people can offer real solutions to the problems we face today.”

  • The Left group in the European Parliament - GUE/NGL: “Our group brings together left-wing MEPs in the European Parliament. We stand up for workers, environment, feminism, peace & human rights. What unites us is the vision of a socially equitable and sustainable Europe based on international solidarity. The European Union must become a project of its people and cannot remain a project of the elites. We want equal rights for women and men, civil rights and liberties and the enforcement of human rights. Anti-Fascism and anti-racism are also a strong part of the tradition of left movements in Europe.”

There is an additional group listed in the table: NI, which stands for Non-Inscrits. Strictly speaking this isn’t a group: it bundles every MEP that doesn’t belong to one. As per the Wikipedia article (https://en.wikipedia.org/wiki/Non-Inscrits), the current MEPs come from different political backgrounds.

To visualise how the different groups vote, an initial approach is a simple heatmap; to that end we subset the dataframe to the political group columns only and replace the voting indication with numerical values (For = 1, Against = -1, Abstain or No political line = 0).

The resulting dataframe is simply a list of voting sessions with a numeric indication of each group’s vote:

votes_hm=eu_v[["GUE-NGL","S&D", "Greens/EFA", "REG", "EPP", "ECR", "IDG", "NI"]]
votes_hmn = votes_hm.replace(["For", "Against", "Abstain", "No political line"], [1,-1,0,0])
votes_hmn
GUE-NGL S&D Greens/EFA REG EPP ECR IDG NI
0 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1
2 1 1 1 1 1 1 1 1
3 1 1 1 1 1 -1 0 1
4 1 1 1 1 1 1 1 1
... ... ... ... ... ... ... ... ...
3294 -1 -1 -1 -1 -1 -1 1 0
3295 1 1 1 -1 -1 0 -1 1
3296 -1 -1 -1 -1 -1 -1 1 -1
3297 -1 -1 -1 -1 -1 -1 1 -1
3298 0 1 1 1 1 1 -1 1

3299 rows × 8 columns

Using Seaborn (https://seaborn.pydata.org/) we can then visualise it.

import matplotlib.pyplot as plt
import matplotlib as mpl
import seaborn as sns


voting_palette = ["#FB6962","#FCFC99","#79DE79"]

fig = plt.figure(figsize=(8,8))
sns.heatmap(votes_hmn,
            square=False,
            yticklabels = False,
            cbar=False,
            cmap=sns.color_palette(voting_palette),
           )
plt.show()
_images/European_Parliament_11_0.png

This visualisation alone can provide some initial insights; for example, the IDG seems to abstain more than the rest, and the ECR group appears to vote Against more often than average. In general, groups from the centre-right to the right wing seem to vote Against more often than the others.
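The visual impression can be checked numerically by computing how often each group takes each position. The following is a minimal sketch on made-up data: the `toy` frame and the group names `A` and `B` are illustrative and not taken from the dataset. The same call could presumably be applied to `votes_hm`.

```python
import pandas as pd

# Hypothetical toy data: one row per vote, one column per group
toy = pd.DataFrame({
    "A": ["For", "For", "Against", "Abstain"],
    "B": ["Against", "Against", "For", "Against"],
})

# Share of each position per group; positions a group never took become 0
shares = toy.apply(lambda col: col.value_counts(normalize=True)).fillna(0.0)
print(shares)
```

Each column of `shares` sums to 1, so groups can be compared directly even if some positions never occur for one of them.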

Who votes with whom? Determining convergence in voting

An initial approach to determine how similar or dissimilar the groups are is simply to determine how many times they have voted exactly in the same way, which is what the following table reflects:

import collections

import numpy as np
pv_list = []
#print("Total voting instances: ", votes_hm.shape[0])

## Not necessarily the most straightforward way (check .crosstab or .pivot_table, possibly with pandas.melt and/or groupby)
## but it follows the same approach as before in using a list of dicts
for party in votes_hm.columns:
    pv_dict = collections.OrderedDict()
    for column in votes_hmn:
        pv_dict[column]=votes_hmn[votes_hmn[party] == votes_hmn[column]].shape[0]
    pv_list.append(pv_dict)

pv = pd.DataFrame(pv_list,index=votes_hm.columns)
pv
GUE-NGL S&D Greens/EFA REG EPP ECR IDG NI
GUE-NGL 3299 2318 2670 2026 1612 978 940 2433
S&D 2318 3299 2642 2841 2449 1571 1087 2536
Greens/EFA 2670 2642 3299 2370 1943 1184 866 2690
REG 2026 2841 2370 3299 2708 1790 1233 2255
EPP 1612 2449 1943 2708 3299 2162 1504 1947
ECR 978 1571 1184 1790 2162 3299 1916 1265
IDG 940 1087 866 1233 1504 1916 3299 1028
NI 2433 2536 2690 2255 1947 1265 1028 3299
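As the code comment notes, the nested loop is not the only way to build this table. One alternative, sketched here on a hypothetical three-group frame (the `toy` data is an assumption, not part of the dataset), compares all columns at once with NumPy broadcasting:

```python
import numpy as np
import pandas as pd

# Hypothetical encoded votes (1 = For, 0 = Abstain, -1 = Against)
toy = pd.DataFrame({
    "A": [1, 1, -1, 0],
    "B": [1, -1, -1, 0],
    "C": [-1, -1, 1, 1],
})

values = toy.to_numpy()
# Compare every column with every other column, vote by vote,
# then count the matches along the vote axis
same = (values[:, :, None] == values[:, None, :]).sum(axis=0)
agreement = pd.DataFrame(same, index=toy.columns, columns=toy.columns)
print(agreement)
```

The diagonal equals the number of votes, as in the table above, and the matrix is symmetric by construction.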

Using a heatmap (but this time for a different purpose and with different options) we can visualise that data in a better way.

fig = plt.figure(figsize=(8,8))
ax = fig.add_subplot()

sns.heatmap(
    pv,
    cmap=sns.color_palette("mako_r"),
    linewidth=1,
    annot = True,
    square =True,
    fmt="d",
    cbar_kws={"shrink": 0.8})
plt.title('European Parliament, identical voting count (2020-08-01 to 2021-01-01)')

plt.show()
_images/European_Parliament_16_0.png

From this we can see, for example, that the group with which GUE/NGL has converged the least is the IDG (940 identical votes), although the IDG's own lowest count is with the Greens/EFA (866), or that the S&D converges most with the REG (and vice-versa, at 2841). This approach, while already useful, only counts identical votes and treats every disagreement alike - is there a better way?

The Distance Matrix of the political groups

One improvement is to reflect the differences in voting behaviour: a party that votes In Favour is closer to a party that Abstains than to one that votes Against. Based on this principle we compute the pairwise Euclidean distance between all groups and create a distance matrix.

from scipy.spatial.distance import squareform
from scipy.spatial.distance import pdist
import scipy.spatial as sp, scipy.cluster.hierarchy as hc
from itables import show


## Transpose the dataframe used for the heatmap
votes_t = votes_hmn.transpose()

## Determine the Euclidean pairwise distance
## ("euclidean" is actually the default option)
pwdist = pdist(votes_t, metric='euclidean')

## Create a square dataframe with the pairwise distances: the distance matrix
distmat = pd.DataFrame(
    squareform(pwdist), # expand the condensed distances into a symmetric square matrix
    columns = votes_t.index,
    index = votes_t.index
)

distmat
GUE-NGL S&D Greens/EFA REG EPP ECR IDG NI
GUE-NGL 0.000000 59.974995 45.155288 69.354164 80.318118 90.972523 89.218832 52.009614
S&D 59.974995 0.000000 48.435524 41.988094 57.323643 77.711003 86.434947 49.879856
Greens/EFA 45.155288 48.435524 0.000000 58.932164 71.833140 86.377080 90.746901 42.000000
REG 69.354164 41.988094 58.932164 0.000000 47.906158 72.166474 83.078276 60.224580
EPP 80.318118 57.323643 71.833140 47.906158 0.000000 60.819405 76.256147 69.598851
ECR 90.972523 77.711003 86.377080 72.166474 60.819405 0.000000 60.149813 82.903558
IDG 89.218832 86.434947 90.746901 83.078276 76.256147 60.149813 0.000000 85.924385
NI 52.009614 49.879856 42.000000 60.224580 69.598851 82.903558 85.924385 0.000000

The findings can be read in a similar way to the previous analysis: for example, the Greens/EFA is closest to the NI group and furthest away from the IDG, while the EPP is most distant from GUE/NGL and has the REG as its closest group.
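The principle behind the distance matrix (For is closer to Abstain than to Against) can be verified on a toy example. The three groups below are hypothetical and vote the same way in all four votes; they are not taken from the dataset.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Hypothetical groups over four votes (1 = For, 0 = Abstain, -1 = Against)
always_for = [1, 1, 1, 1]
always_abstain = [0, 0, 0, 0]
always_against = [-1, -1, -1, -1]

d = squareform(pdist(np.array([always_for, always_abstain, always_against]),
                     metric="euclidean"))
print(d)
```

The distance between "always For" and "always Abstain" is 2, half the distance between "always For" and "always Against" (4). A count of identical votes would give 0 matches in both cases and could not tell them apart.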

This pairwise analysis can fortunately be grouped automatically; for this we use Ward clustering to obtain a dendrogram that can be combined with a heatmap: a clustermap, which has the advantage of automatically reordering the columns and rows to show how the groups are positioned in terms of distance.

## Perform hierarchical linkage on the distance matrix using Ward's method.
distmat_link = hc.linkage(pwdist, method="ward", optimal_ordering=True)

sns.clustermap(
    distmat,
    annot = True,
    cmap=sns.color_palette("Greens_r"),
    linewidth=1,
    #standard_scale=1,
    row_linkage=distmat_link,
    col_linkage=distmat_link,
    figsize=(10,10)).fig.suptitle('European Parliament, euclidean distance and Ward clustering \n(2020-08-01 to 2021-01-01), Clustermap')

plt.show()
_images/European_Parliament_20_0.png

The results are much more readable. We can clearly see that:

  • The first split separates the IDG and ECR from the rest.

  • The next split separates the EPP, REG and S&D (the last two closer together)

  • Finally the GUE/NGL, the Greens/EFA and the NI constitute a separate branch (with GUE/NGL branching out first)
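The branches read off the dendrogram can also be extracted programmatically by cutting the Ward tree into a fixed number of flat clusters. The sketch below is an addition rather than part of the original notebook: it copies the pairwise distances from the distance matrix shown earlier into a condensed vector (the name `condensed` is ours) and uses SciPy's `fcluster` with the `maxclust` criterion.

```python
import numpy as np
import scipy.cluster.hierarchy as hc

groups = ["GUE-NGL", "S&D", "Greens/EFA", "REG", "EPP", "ECR", "IDG", "NI"]

# Upper triangle of the distance matrix above, row by row (pdist order)
condensed = np.array([
    59.974995, 45.155288, 69.354164, 80.318118, 90.972523, 89.218832, 52.009614,
    48.435524, 41.988094, 57.323643, 77.711003, 86.434947, 49.879856,
    58.932164, 71.833140, 86.377080, 90.746901, 42.000000,
    47.906158, 72.166474, 83.078276, 60.224580,
    60.819405, 76.256147, 69.598851,
    60.149813, 82.903558,
    85.924385,
])

# Same linkage as in the clustermap, then cut the tree into 3 flat clusters
link = hc.linkage(condensed, method="ward")
labels = hc.fcluster(link, t=3, criterion="maxclust")
clusters = dict(zip(groups, labels))
print(clusters)
```

The label numbers themselves are arbitrary; what matters is which groups share a label.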

DBSCAN and Spectral Clustering

An additional line of inquiry is to determine, based on the relative affinity, how many clusters can be identified, or how the parties cluster when divided into a fixed number of clusters.

The first step in answering this is to compute the affinity matrix from the distance matrix. We start by normalising the distance matrix.

import numpy as np

distmat_mm=((distmat-distmat.min().min())/(distmat.max().max()-distmat.min().min()))*1
pd.DataFrame(distmat_mm, distmat.index, distmat.columns)
GUE-NGL S&D Greens/EFA REG EPP ECR IDG NI
GUE-NGL 0.000000 0.659265 0.496362 0.762364 0.882883 1.000000 0.980723 0.571707
S&D 0.659265 0.000000 0.532419 0.461547 0.630120 0.854225 0.950121 0.548296
Greens/EFA 0.496362 0.532419 0.000000 0.647802 0.789614 0.949485 0.997520 0.461678
REG 0.762364 0.461547 0.647802 0.000000 0.526600 0.793278 0.913224 0.662008
EPP 0.882883 0.630120 0.789614 0.526600 0.000000 0.668547 0.838233 0.765054
ECR 1.000000 0.854225 0.949485 0.793278 0.668547 0.000000 0.661187 0.911303
IDG 0.980723 0.950121 0.997520 0.913224 0.838233 0.661187 0.000000 0.944509
NI 0.571707 0.548296 0.461678 0.662008 0.765054 0.911303 0.944509 0.000000

We can now obtain the affinity matrix.

affinmat_mm = pd.DataFrame(1-distmat_mm, distmat.index, distmat.columns)
affinmat_mm 
GUE-NGL S&D Greens/EFA REG EPP ECR IDG NI
GUE-NGL 1.000000 0.340735 0.503638 0.237636 0.117117 0.000000 0.019277 0.428293
S&D 0.340735 1.000000 0.467581 0.538453 0.369880 0.145775 0.049879 0.451704
Greens/EFA 0.503638 0.467581 1.000000 0.352198 0.210386 0.050515 0.002480 0.538322
REG 0.237636 0.538453 0.352198 1.000000 0.473400 0.206722 0.086776 0.337992
EPP 0.117117 0.369880 0.210386 0.473400 1.000000 0.331453 0.161767 0.234946
ECR 0.000000 0.145775 0.050515 0.206722 0.331453 1.000000 0.338813 0.088697
IDG 0.019277 0.049879 0.002480 0.086776 0.161767 0.338813 1.000000 0.055491
NI 0.428293 0.451704 0.538322 0.337992 0.234946 0.088697 0.055491 1.000000
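The two steps (min-max normalisation over the whole matrix, then taking the complement) are easy to follow on a small example. The three-group distance matrix below is hypothetical and chosen so that the arithmetic can be done by hand.

```python
import pandas as pd

# Hypothetical symmetric distance matrix for three groups
names = ["A", "B", "C"]
dist = pd.DataFrame([[0.0, 2.0, 8.0],
                     [2.0, 0.0, 6.0],
                     [8.0, 6.0, 0.0]], index=names, columns=names)

# Min-max scaling over the whole matrix (not column by column)
smallest, largest = dist.min().min(), dist.max().max()
dist_mm = (dist - smallest) / (largest - smallest)

# Affinity is the complement of the normalised distance
affinity = 1 - dist_mm
print(affinity)
```

Because the diagonal of a distance matrix is zero, the minimum is always 0 and the normalisation amounts to dividing by the largest distance. The most distant pair therefore ends up with an affinity of exactly 0, as GUE-NGL and ECR do in the table above.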

We will use Density-based spatial clustering of applications with noise (DBSCAN) as our clustering algorithm.

from sklearn.cluster import DBSCAN

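## Note: with the default metric, DBSCAN treats each row of the affinity
## matrix as a feature vector and measures the Euclidean distance between rows
dbscan_labels = DBSCAN(eps=1.1).fit(affinmat_mm)
## Map each group to its cluster label (-1 marks noise)
dbscan_dict = dict(zip(distmat_mm,dbscan_labels.labels_))
dbscan_dict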
{'GUE-NGL': 0,
 'S&D': 0,
 'Greens/EFA': 0,
 'REG': 0,
 'EPP': 0,
 'ECR': -1,
 'IDG': -1,
 'NI': 0}

We get a simple split: DBSCAN labels the ECR and the IDG as noise (label -1, meaning they are not assigned to any cluster) and groups all the others together in a single cluster (label 0).
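To see how the -1 label arises, the sketch below runs DBSCAN on four hypothetical one-dimensional points. Unlike the call above, it passes a distance matrix directly with `metric="precomputed"`; this is a different setup shown for illustration only, and the `eps` and `min_samples` values are assumptions chosen for the toy data.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical one-dimensional positions: three close points and one outlier
x = np.array([0.0, 0.1, 0.2, 5.0])
dist = np.abs(x[:, None] - x[None, :])  # pairwise distance matrix

# metric="precomputed" tells DBSCAN that `dist` already holds distances
labels = DBSCAN(eps=0.3, min_samples=2, metric="precomputed").fit(dist).labels_
print(labels)  # [ 0  0  0 -1]
```

The three neighbouring points form cluster 0, while the isolated point has no neighbour within `eps` and is reported as noise.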

A different approach is to use Spectral Clustering, an algorithm that is initialised with a pre-determined number of clusters; here we set it to 3.

from sklearn.cluster import SpectralClustering
sc = SpectralClustering(3, affinity="precomputed",random_state=2020).fit_predict(affinmat_mm)
sc_dict = dict(zip(distmat,sc))

print(sc_dict)
{'GUE-NGL': 2, 'S&D': 0, 'Greens/EFA': 2, 'REG': 0, 'EPP': 0, 'ECR': 1, 'IDG': 1, 'NI': 2}

The results are consistent with what one would expect when looking at the previous clustermap:

  • One group with the ECR and IDG

  • One group with the S&D, REG and EPP

  • One group with the GUE/NGL, Greens/EFA and NI
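For readers unfamiliar with the `affinity="precomputed"` option, here is a minimal sketch on a hypothetical affinity matrix with two obvious blocks. The matrix values are made up; only the call pattern mirrors the one used above.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Hypothetical affinity matrix with two obvious blocks of three items each
affinity = np.full((6, 6), 0.05)
affinity[:3, :3] = 0.9
affinity[3:, 3:] = 0.9
np.fill_diagonal(affinity, 1.0)

labels = SpectralClustering(2, affinity="precomputed",
                            random_state=2020).fit_predict(affinity)
print(labels)
```

The first three items should share one label and the last three the other. Which block gets label 0 is arbitrary, which is why the cluster numbers in the output above carry no meaning beyond grouping.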

Multidimensional Scaling

Based on what we’ve done above we can now visualise the relative distances between all the groups in a map: this can be achieved by multidimensional scaling (MDS), a method that reduces the number of dimensions while preserving the pairwise distances as closely as possible.

What this means is that we can reduce to 2 or 3 dimensions and obtain a plot of how close the parties are that maintains the relative distance; we can also use the information obtained from Spectral Clustering in the form of the colours of the data points, thus combining relative distance and clustering.

from sklearn.manifold import MDS

mds = MDS(n_components=2, dissimilarity='precomputed',random_state=2020, n_init=100, max_iter=1000)

## We use the normalised distance matrix but results would
## be similar with the original one, just with a different scale/axis
results = mds.fit(distmat_mm.values)
coords = results.embedding_
coords
## Graphic options
sns.set()
sns.set_style("ticks")

fig, ax = plt.subplots(figsize=(8,8))

plt.title('European Parliament, MDS \n(2020-08-01 to 2021-01-01)', fontsize=14)

for label, x, y in zip(distmat_mm.columns, coords[:, 0], coords[:, 1]):
    ax.scatter(x, y, c = "C"+str(sc_dict[label]), s=250)
    ax.axis('equal')
    ax.annotate(label,xy = (x-0.02, y+0.025))
plt.show()
_images/European_Parliament_33_0.png

This view is perhaps one of the most useful in getting an overview of how the political groups relate to each other based on their voting records.
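A quick way to build intuition for what MDS preserves is to embed points whose true layout is known. The sketch below uses the corners of a unit square (hypothetical data) and mirrors the `dissimilarity='precomputed'` argument used above; newer scikit-learn releases may emit deprecation warnings for this parameter name.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import MDS

# Hypothetical points: the corners of a unit square
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
dist = squareform(pdist(square))

mds = MDS(n_components=2, dissimilarity="precomputed",
          random_state=2020, n_init=10, max_iter=1000)
embedding = mds.fit_transform(dist)

# Distances between the embedded points, to compare with the input
recovered = squareform(pdist(embedding))
print(np.round(dist, 2))
print(np.round(recovered, 2))
```

The embedded coordinates will generally be rotated, reflected or shifted relative to the original square, since only the distances between points are fitted. The same applies to the maps above: the axes have no meaning of their own.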

The 3D equivalent can be seen here:

from sklearn.manifold import MDS
import mpl_toolkits.mplot3d
import random
mds = MDS(n_components=3, dissimilarity='precomputed',random_state=2020, n_init=100, max_iter=1000)

## We use the normalised distance matrix but results would
## be similar with the original one, just with a different scale/axis
results = mds.fit(distmat_mm.values)
coords = results.embedding_
coords
## Graphic options
sns.set()
sns.set_style("ticks")


fig = plt.figure(figsize=(10,10))
ax = fig.add_subplot(111, projection='3d')

fig.suptitle('European Parliament, MDS \n(2020-08-01 to 2021-01-01)', fontsize=14)
ax.set_title('MDS with Spectral Clustering clusters (3D)')

## annotate3D is a helper that is not shown in this section
## (assumed to be defined earlier in the notebook)
for label, x, y, z in zip(distmat_mm.columns, coords[:, 0], coords[:, 1], coords[:, 2]):
    ax.scatter(x, y, z, c="C"+str(sc_dict[label]),s=250)
    annotate3D(ax, s=str(label), xyz=[x,y,z], fontsize=10, xytext=(-3,3),
               textcoords='offset points', ha='right',va='bottom')
plt.show()
_images/European_Parliament_36_1.png

MDS per Policy Area

Finally we can apply the 2D MDS and clustering to each individual Policy Area; the approach is the same but applied to a subset of the votes, providing the relative distance of the parties in the different domains.

for area in eu_v["Policy Area"].unique():
    varea=eu_v[eu_v["Policy Area"] == area]
    avotes_hm=varea[["GUE-NGL","S&D", "Greens/EFA", "REG", "EPP", "ECR", "IDG", "NI"]]
    avotes_hmn = avotes_hm.replace(["For", "Against", "Abstain", "No political line"], [1,-1,0,0])
 
    avotes_t = avotes_hmn.transpose()
    apwdist = pdist(avotes_t, metric='euclidean')
    adistmat = pd.DataFrame(
        squareform(apwdist), # pass a symmetric distance matrix
        columns = avotes_t.index,
        index = avotes_t.index)
    adistmat_mm=((adistmat-adistmat.min().min())/(adistmat.max().max()-adistmat.min().min()))*1
    
    ## Affinity for this policy area, from the area's own normalised distances
    aaffinmat_mm = pd.DataFrame(1-adistmat_mm, adistmat.index, adistmat.columns)

    asc = SpectralClustering(3, affinity="precomputed",random_state=2020).fit_predict(aaffinmat_mm)
    asc_dict = dict(zip(adistmat,asc))   
    
    amds = MDS(n_components=2, dissimilarity='precomputed',random_state=2020, n_init=100, max_iter=1000)
    aresults = amds.fit(adistmat_mm.values)
    acoords = aresults.embedding_
    
    sns.set()
    sns.set_style("ticks")

    fig, ax = plt.subplots(figsize=(8,8))

    plt.title(area, fontsize=14)

    for label, x, y in zip(adistmat_mm.columns, acoords[:, 0], acoords[:, 1]):
        ax.scatter(x, y, c = "C"+str(asc_dict[label]), s=250)
        #ax.scatter(x, y, s=250)
        ax.axis('equal')
        ax.annotate(label,xy = (x-0.02, y+0.025))
    plt.show()    
_images/European_Parliament_39_0.png _images/European_Parliament_39_1.png _images/European_Parliament_39_2.png _images/European_Parliament_39_3.png _images/European_Parliament_39_4.png _images/European_Parliament_39_5.png _images/European_Parliament_39_6.png _images/European_Parliament_39_7.png _images/European_Parliament_39_8.png _images/European_Parliament_39_9.png _images/European_Parliament_39_10.png _images/European_Parliament_39_11.png _images/European_Parliament_39_12.png _images/European_Parliament_39_13.png _images/European_Parliament_39_14.png _images/European_Parliament_39_15.png _images/European_Parliament_39_16.png _images/European_Parliament_39_17.png _images/European_Parliament_39_18.png _images/European_Parliament_39_19.png
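One caveat when working on subsets: if a policy area contains only votes in which every group took the same position, all pairwise distances are zero and the min-max normalisation divides zero by zero, which yields NaN. A possible guard is sketched below; the helper name `normalised_distance` and the toy frames are hypothetical and not part of the original notebook.

```python
import pandas as pd
from scipy.spatial.distance import pdist, squareform


def normalised_distance(encoded_votes: pd.DataFrame) -> pd.DataFrame:
    """Euclidean distances between columns, scaled to the range 0..1."""
    groups = encoded_votes.columns
    dist = squareform(pdist(encoded_votes.T.to_numpy(dtype=float),
                            metric="euclidean"))
    largest = dist.max()
    if largest > 0:
        # The minimum is the zero diagonal, so dividing by the maximum
        # is equivalent to min-max scaling
        dist = dist / largest
    return pd.DataFrame(dist, index=groups, columns=groups)


# Hypothetical policy area where every group voted the same way
unanimous = pd.DataFrame({"A": [1, 1], "B": [1, 1], "C": [1, 1]})
print(normalised_distance(unanimous))  # all zeros rather than NaN

# Hypothetical policy area with some disagreement
mixed = pd.DataFrame({"A": [1, 1], "B": [1, -1], "C": [-1, -1]})
print(normalised_distance(mixed))
```

With such a helper the per-area loop could skip, or plot separately, any area whose distance matrix is entirely zero.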