Security sites to bookmark: FireEye, darkmatters.norsecorp and Blueliv

New trends in security intelligence services

A traditional marketing element already present in most security providers' Internet presence is a blog on current topics of interest: a smart way to attract readers while showcasing their added value as a security company.

This is the case with three international players. All three are relatively new to this sector and combine technology solutions with intelligence services: FireEye, founded in 2004; Norse, created in 2010; and Blueliv, founded in 2009. The first two even team up for customers as relevant as the US Department of Energy.

FireEye, the veteran in this field, is a company that quickly grasped, already in 2004, the relevance to the business world of advanced persistent threats (customised cyber attacks, at the end of the day). By the time these attacks were hitting the mass media, it had already devised a product and a service to protect companies.

FireEye offers two blogs:

- ThreatResearch talks about current Internet threats. I recommend it to those who want to know the technical details of new malware campaigns and espionage operations that come to light.

- ExecutivePerspectives, less technical, focuses on business matters. It raises awareness among executive managers and budget decision-makers about cyber (in)security.

Let's remember that in 2014 FireEye acquired Mandiant, the security consulting firm founded by Kevin Mandia.

Norse Corporation also offers both an appliance to install and security intelligence services to hire. Its blog presents news on current cyber attacks together with its executives' public appearances, such as those of Sam Glines, Norse co-founder. It also links to a colourful world map of current Internet attacks that appears to update in real time: a very effective way to amaze those who do not work in our sector.

An example of a typical blog post is the one showing the use of Splunk, the popular and successful log search engine, with their security intelligence data feed i.e. the product that provides the data presented in the attack map mentioned above.

Blueliv was founded by Daniel Solis. Its value proposition is innovative: Gartner named it a "cool vendor" in 2015. Its blog targets business people, researchers and industry practitioners. There are also some free resources ranging from datasheets to reports and videos. They also display an impressive cyber threat map.

In short, visiting these three blogs could be a first step for those security professionals who want an introduction to the security intelligence services arena.

Happy security intelligence gathering!


Book review: The wisdom of crowds - A leveraging tool

James Surowiecki's book "The Wisdom of Crowds" fell into my hands and I read it during the summer of 2015. These are the main learning points that I drew from it.

Disclaimer: By no means does this personal and non-comprehensive post aim to replace reading the book. It is a biased set of thoughts, most of them extracted from the book, that went through my mind while reading it.

- Diversity and independence are key ingredients to preserve in collective decisions.
- To these two points, add also decentralization and aggregation.
- A decision market is a working method to capture collective wisdom.
- Making a group diverse makes it better at problem solving.
- Expertise is spectacularly narrow.
- A large group of diverse and independent people will come up with better decisions.
- Collective decisions are only wise when they draw from very different information sources.
- Centralization is never the answer. Aggregation is.
- Betting markets are very good at predicting events.
- Crowds find the way to collectively benefit even without speaking to each other if everyone knows that everybody is trying to make a decision.
- We live in a society in which convention has won over rationality (e.g. why do all films cost almost the same at the cinema?).
- Maybe as individuals we do not know where we are going but as a group we can achieve great accomplishments.
- People think that others should end up where they deserve to be. Merit is a key element in accepting reality.
- Vehicle traffic: Very easy to create traffic jams. Very complex to get rid of them. As a swarm, we drive quicker if we coordinate with surrounding vehicles.
- If the traffic jam is massive, no easy solution. Personal thought: Maybe then stop the car and read a book.
- Academic challenges in a collaborative environment are a morale booster.
- Reputation should not become the basis of a scientific hierarchy.
- Sometimes, being a member of a group can make people dumber (especially if the group is small and has leaders in it).
- Sometimes small groups start with the conclusions instead of reaching them through an evidence-gathering process.
- Small-group view polarization exists. Hierarchies make it even worse.
- The order in which people speak plays an important role.
- People who think of themselves as leaders will influence groups more than others, even if they lack expertise on what they talk about.
- Groups need an efficient way to aggregate their members' opinions.
- Investors do not always behave rationally.
- Investors get emotionally attached to their shares.
- Individual irrationality can create collective rationality.
- On average crowds will give you a better answer than individuals.
- Healthy markets are led both by fear and greed.
- Bubbles and crashes are examples of crowd decisions going wrong.
- Groups are smart only when their information sources are balanced in terms of their ownership.
- All these points can be applied (and they are actually being applied) also into the business world.
- These thoughts justify why democracy is preferred to other organisational systems.
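The core claim running through these points — that averaging many independent, diverse estimates beats most individuals — can be sketched in a few lines of Python. This is a toy simulation with invented numbers (the jar value and noise level are assumptions), not anything from the book itself:

```python
import random
import statistics

random.seed(42)

TRUE_VALUE = 850   # e.g. jellybeans in a jar (hypothetical)
N_PEOPLE = 1000

# Each person's guess is the true value distorted by independent noise.
guesses = [TRUE_VALUE + random.gauss(0, 300) for _ in range(N_PEOPLE)]

# The crowd's estimate is the simple average of all guesses.
crowd_estimate = statistics.mean(guesses)
crowd_error = abs(crowd_estimate - TRUE_VALUE)

# Compare with how far off a typical individual is.
individual_error = statistics.mean(abs(g - TRUE_VALUE) for g in guesses)

print(f"crowd error:      {crowd_error:.1f}")
print(f"individual error: {individual_error:.1f}")
```

The noise cancels out in the aggregate: the averaged estimate lands far closer to the true value than the typical single guess, which is exactly the diversity-plus-aggregation effect the bullet points describe.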

As Infosec professionals, if we keep these points in mind when designing security controls and security awareness sessions, the value we deliver will be higher.

Happy crowded reading!

Groups fly!

Economics Book Review: Global Financial Systems: Stability & Risk by Jon Danielsson

How come an Information Security blog now posts a review of a book dealing with the foundations of modern finance?

If you wonder why, then you are probably just starting as an Information Security professional. Good luck to you! Train your psychological resilience.

If you are reading this post to find out why this book is recommendable, then surely you have wondered how Information Security can provide value to the business.

This book, titled Global Financial Systems: Stability and Risk, is used by its author, Jon Danielsson, in his lectures on global financial systems at the London School of Economics.

In 19 chapters and several weeks' reading time, readers get a first comprehensive idea of what has happened over the last decade and what is currently happening in the global financial crisis. Not only that: readers also gain an understanding of key financial concepts.

This information will be of great help in understanding the business functionality of the IT systems that you will probably pen-test, secure, harden or white-hat hack. And not only in the financial sector: literally in any industry somehow related to or affected by banks, i.e. in all industries.

Chapter 1 deals with systemic risk. Worth being highlighted are the interlinks among different risks and the concept of fractional reserve banking.

I identified four concepts that could have a reflection also in the Information Security field: procyclicality, information asymmetry, interdependence and perverse incentives.

Chapter 2 talks about the Great Depression from 1929 to 1933 and four potential causes such as trade restrictions, wrong monetary policies, competitive devaluations and agricultural overproduction.

Chapter 3 talks about a very special type of risk: endogenous risk. The author presents a graph showing how perceived risk lags actual risk over time. A very interesting concept to apply also in Information Security.

Chapter 4 deals with liquidity and the different models banks follow (or should follow). Liquidity is essential but, as this chapter shows, complex. The distinction between funding liquidity and market liquidity is also an eye-opener.

Chapter 5 describes central banking and banking supervision. The origin of central banking dates from 1668 in Sweden and from 1694 in England. The author mentions two key roles in central banking: monetary policy and financial stability.

Chapter 6 teaches us why short-term foreign currency borrowing is a bad idea.

Chapter 7 describes the importance of the fractional reserve system and a concept that is almost the opposite of what information security professionals face on a daily basis: moral hazard (literally, "it is what happens when those taking risks do not have to face the full consequences of failure but they enjoy all benefits of success").

Chapter 8 deals with the complexity of coming up with a smart deposit insurance policy that would avoid "moral hazard" possibilities in a fractional reserve banking system.

Chapter 9 describes the problems that trading actions like short selling can bring into the financial system. An impartial reader of this chapter would see the need to come up with an effective and worldwide trading regulation. Concepts such as a "clearing house" and a "central counterparty" are mentioned.

Chapters 10 and 15: Market participants need to know probabilities of default when engaging in credit activities. These chapters explain securitisation concepts such as Special Purpose Vehicles (SPV), Collateralised Debt Obligations (CDO), Asset Backed Securities (ABS) and Credit Default Swaps (CDS). Could you think of similar concepts being used in Information Security?
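For readers who like to see the arithmetic, the probability of default feeds into the standard expected-loss formula used in credit risk, EL = PD × LGD × EAD. This sketch is not from the book; the loan figures are entirely made up:

```python
# Expected loss on a credit exposure: EL = PD * LGD * EAD
#   PD  = probability of default over the horizon
#   LGD = loss given default (fraction of exposure not recovered)
#   EAD = exposure at default (amount outstanding)

def expected_loss(pd: float, lgd: float, ead: float) -> float:
    return pd * lgd * ead

# Hypothetical loan book: (PD, LGD, EAD) per loan.
book = [(0.02, 0.45, 1_000_000),
        (0.10, 0.60, 250_000),
        (0.01, 0.40, 5_000_000)]

total_el = sum(expected_loss(*loan) for loan in book)
print(f"Portfolio expected loss: {total_el:,.0f}")  # 9000 + 15000 + 20000 = 44,000
```

The same additive "expected annual loss" reasoning shows up in information security risk registers, which is one answer to the question above.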

Chapter 11 presents the "impossible trinity" i.e. no country is able to pursue simultaneously these three goals: fixed exchange rate, free capital movements and an independent monetary policy. Remember that the biggest market is the foreign exchange market.

Chapter 12 focuses on mathematical models of currency crises. The reader can see how these models evolved and how the global games model was proposed.

Chapter 13 goes through the different sets of international financial regulation i.e. Basel I and Basel II. There is also an appendix referring to the Value-At-Risk model.

Chapter 14 could trigger some discussions. There is a patent political element in bailing out banks. Should governments move private-sector bank losses onto the public sector or not?

Chapter 16 shows the need to take into account concepts such as tail risk, endogenous risk and systemic risk. Very very interesting reading for us information security professionals.

Chapters 17, 18 and 19 deal with current developments. Chapter 17 studies the 2007-2009 period of the latest financial crisis, chapter 18 describes efforts to develop financial regulations, and chapter 19 talks about the current sovereign debt crisis and its relation to the common currency and the challenge of a transfer union, i.e. a higher degree of unification.

In addition, the website of the book offers the slides of every chapter and three additional chapters with updated information on the European crisis, financial regulations and current challenges in financial policy.

Happy risk management!

Risky times


Book Review: Executive Data Science by Brian Caffo, Roger D. Peng and Jeffrey Leek

In an introduction to the Data Science world, one needs to build the right frame around the topic. This is usually done via a set of straight-to-the-point books that I mention or summarise in this blog. This is the third one. All of them appear under the "data science" label.

This third book is written by Brian Caffo, Roger D. Peng and Jeffrey Leek. Its title is "Executive Data Science". You can get it here. If you had to choose only one among the three books I have discussed in this blog, this is probably the most comprehensive one.

The collection of bullet points that I have extracted from the book is a way to acknowledge value in a twofold manner: first, I praise the book and congratulate the authors and second, I try to condense in some lines a very personal collection of points extracted from the book.

As always, here goes my personal disclaimer: reading this very personal and non-comprehensive list of bullet points by no means replaces reading the book it refers to; on the contrary, this post is an invitation to read the entire work.

In approximately 150 pages, the book literally provides the following key points (please consider all bullet points as quoted, i.e. they show text coming from the book):

- "Descriptive statistics have many uses, most notably helping us get familiar with a data set".
- Inference is the process of making conclusions about populations from samples.
- The most notable example of experimental design is randomization.
- Two types of learning: supervised and unsupervised.
- Machine Learning focuses on learning.
- Code and software play an important role to see if the data that you have is suitable for answering the question that you have.
- The five phases of a data science project are: question, exploratory data analysis, formal modeling, interpretation and communication.
- There are two common languages for analyzing data. The first one is the R programming language. R is a statistical programming language that allows you to pull data out of a database, analyze it, and produce visualizations very quickly. The other major language used for this type of analysis is Python, a similar language that allows you to pull data out of databases, analyze and manipulate it, visualize it, and connect to downstream production systems.
- Documentation basically implies a way to integrate the analysis code and the figures and plots created by the data scientist with plain text that explains what's going on. One example is the R Markdown framework. Another example is IPython notebooks.
- Shiny, by RStudio, is a way to build data products that you can share with people who don't necessarily have a lot of data science experience.
- Data Engineer and Data Scientist: A data engineer builds out your system for actually computing on that infrastructure. A data scientist needs to be able to do statistics.
- Data scientists: They usually know how to use R or Python, which are general-purpose data science languages that people use to analyze data. They know how to do some kind of visualization, often interactive visualization with something like D3.js. And they'll likely know SQL in order to pull data out of a relational database.
- A data science web site is also mentioned.
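Several of the bullets above — descriptive statistics and exploratory analysis in particular — can be illustrated in a few lines of Python, one of the two languages the book names. The sample data here is invented (hypothetical response-time measurements), purely for illustration:

```python
import statistics

# Hypothetical sample: daily response times (ms) pulled from a log.
data = [120, 135, 128, 300, 122, 131, 127, 125, 2100, 129]

# Descriptive statistics: the first step to get familiar with a data set.
summary = {
    "n": len(data),
    "mean": statistics.mean(data),
    "median": statistics.median(data),
    "stdev": statistics.stdev(data),
}
print(summary)
# The large gap between mean and median flags the outliers (300, 2100)
# before any formal modeling starts.
```

This is the "exploratory data analysis" phase of the five-phase project the book describes: look at simple summaries first, and let surprises (here, a mean far above the median) drive the questions you ask next.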

The authors also provide useful comments on creating, managing and growing a data science team. They start with the basics e.g. "It’s very helpful to right up front have a policy on the Code of Conduct".

- Data science is an iterative process.
- The authors also mention the different types of data science questions (as already covered in the summary of the book titled "The Elements of Data Analytic Style").
- They also provide an exploratory data analysis checklist.
- Some words on how to start with modeling.
- Instead of starting to discuss causal analysis, they talk about associational analysis.
- They also provide some tips on data cleaning, interpretation and communication.
- Confounding: The apparent relationship or lack of relationship between A and B may be due to their joint relationship with C.
- A/B testing: giving two options.
- It’s important not to confuse randomization, a strategy used to combat lurking and confounding variables, with random sampling, a strategy used to help with generalizability.
- p-value and null hypothesis are also mentioned.
- Finally they link to knitr.
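The A/B-testing and p-value bullets can be made concrete with a standard two-proportion z-test. This sketch uses only the Python standard library; the conversion counts are invented for illustration:

```python
import math

# Hypothetical A/B test: conversions out of visitors for two page variants.
conv_a, n_a = 120, 2400   # variant A: 5.0% conversion
conv_b, n_b = 156, 2400   # variant B: 6.5% conversion

p_a, p_b = conv_a / n_a, conv_b / n_b
p_pool = (conv_a + conv_b) / (n_a + n_b)

# Two-proportion z-test under the null hypothesis that both rates are equal.
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se
p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided p-value

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A p-value below the usual 0.05 threshold would lead us to reject the null hypothesis of equal conversion rates — which is exactly the "giving two options" comparison the A/B bullet refers to, made quantitative.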

Happy data-ing!

Find your way

Book Review: The Elements of Data Analytic Style by @jtleek i.e. Jeffrey Leek

In an introduction to the Data Science world, one needs to build the right frame around the topic. This is usually done via a set of straight-to-the-point books that I will be summarising in this blog.

The second book I cover is written by Jeffrey Leek. Its title is "The Elements of Data Analytic Style". You can get it here. It is a primer on basic statistical concepts worth keeping in mind when embarking on a scientific journey.

This summary is a way to acknowledge value in a twofold manner: first, I praise the book and congratulate the author; second, I share with the community a very personal summary of the book.

Let me try to share with you the main learning points I collected from this book. As always, here goes my personal disclaimer: reading this very personal and non-comprehensive summary by no means replaces reading the book it refers to; on the contrary, this post is an invitation to read the entire work.

In approximately 100 pages, the book provides the following key points:

Type of analysis
Figure 2.1, titled "the data analysis question type flow chart", is the foundation of the book. It classifies the different types of data analysis. The basic one is descriptive analysis (reporting results with no interpretation). A step further is an exploratory analysis (will the proposed statements still hold in a qualitative way using a different sample?).

If this also holds true in a quantitative manner, then we are in an inferential analysis. If we can use a subset of measures to predict some others, then we can talk about a predictive analysis. The next step, certainly less frequent, is the possibility to seek a cause: then we are in a causal analysis. Finally, and very rarely, if we go beyond statistics and find a deterministic relation, then we are in a mechanistic analysis.

Correlation does not imply causation
This is key to understand. The additional element needed to really grasp it is the existence of confounders, i.e. additional variables, not touched by the statistical work we are engaged in, that connect the variables we are studying. Two telling examples are mentioned in the book:
- The consumption of ice cream and the murder rate are correlated. However, there is no causality. There is a confounder: the temperature.
- Shoe size and literacy are correlated. However there is a confounder here: age.

Other typical mistakes
Overfitting: Using a single unsplit data set for both model building and testing.
Data dredging: Fitting a large number of models to a data set.

Components of a data set
It is not only the raw data, but also the tidy data set, a code book describing each of the variables and their values in the tidy data set, and a script on how to reach the tidy data set from the raw data.
The data set should be understandable even if you, as producer or curator of the data set, are not there.
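The raw-to-tidy script mentioned above can be as small as this Python sketch (the table and column names are invented): it reshapes a "wide" raw table, with one column per year, into a tidy one with one observation per row and one variable per column.

```python
import csv
import io

# Hypothetical raw data: one column per year (wide format).
raw = """country,2019,2020
Spain,100,110
France,90,95
"""

# Tidy: one observation per row, one variable per column.
tidy = []
reader = csv.DictReader(io.StringIO(raw))
for row in reader:
    for year in ("2019", "2020"):
        tidy.append({"country": row["country"],
                     "year": int(year),
                     "value": int(row[year])})

for rec in tidy:
    print(rec)
```

Keeping this script next to the raw data means anyone can regenerate the tidy data set without you being there, which is the point of shipping all four components together.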

Type of variables
Continuous, ordinal, categorical, missing and censored.

Some useful tips
- The preferred way to graphically represent data: plot your data.
- Explore your data thoroughly before jumping to statistical analysis.
- Use a linear regression analysis and compare it with the initial scatterplot of the original data.
- More data usually beats better algorithms.

Section 9 provides some hints on how to write an analysis. Section 10 plays a similar role for creating graphs. Section 11 hints at how to present the analysis to the community. Section 12 addresses how to make the entire analysis reproducible. Section 14 provides a checklist and Section 15 additional references.

Happy analysis!

Happy stats!