Thursday, September 18, 2014

Understanding databases and building new data systems

rxin/db-readings · GitHub



A list of papers essential to understanding databases and building new
data systems. The list is curated and maintained by Reynold Xin.

Arlington schools announce key findings from ‘Big Data’ competition - The Washington Post




Arlington Schools, with the help of CK-12 (a Silicon Valley-based
nonprofit) and parent volunteer Aneesh Chopra (Obama's first Chief
Technology Officer), hosted a competition to find predictive models and
trends to improve the school system's graduation rate. Here's what they found.

What Is Big Data? - datascience@berkeley




The problem with a term like "big data" is that just when you thought you'd
finally got past the endless discussions about what big data is—or
isn't—someone comes along and asks, "just what is big data anyway?" and
it all starts back up. Berkeley's School of Information asked 40
"thought leaders"—ranging from Hal Varian, Google's chief economist, to
our own Jon Bruner—and here's what they said.



The three Vs: volume, velocity, and variety.

Before you take over supporting a web application…. - Eric Parvin - Site Home - MSDN Blogs

Standard utilities used to understand application-level process calls:

TCPview (http://technet.microsoft.com/en-us/sysinternals/bb897437.aspx) will show database and web service calls.

Process Monitor (http://technet.microsoft.com/en-us/sysinternals/bb896645) captures file, registry, and thread activity.

Process Explorer (http://technet.microsoft.com/en-us/sysinternals/bb896653) identifies the PID for a worker process and DLL files loaded in the process.

PerfView (http://www.microsoft.com/en-us/download/details.aspx?id=28567) is a low-impact .NET profiler.

A PerfView tutorial series is available at http://channel9.msdn.com/Series/PerfView-Tutorial












Future posts will dive
into using Log Parser to inspect the web application through the IIS logs.
The first two queries below provide the number of requests and the
average response time per hour for the pages. Run the queries on the IIS
logs, export the details from the datagrid to Excel, and determine
the load pattern.






Load per hour on the server (includes all pages and HTTP Status Codes):

logparser
"SELECT quantize(time,3600), count(*) as Frequency from
<path to logs>\*.log GROUP BY quantize(time,3600) order by
quantize(time,3600)" -i:IISW3C -o:datagrid






This query lists the average response time per hour for page requests that return HTTP status code 200, grouped by page:

logparser
"SELECT TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time),3600)),
avg(time-taken), cs-uri-stem from <path to logs>\*.log WHERE
sc-status=200 GROUP BY cs-uri-stem,
TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time),3600)) ORDER BY
TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time),3600)) DESC" -i:IISW3C
-o:datagrid






This query shows the number of
requests and average time for a single page or a grouping of pages.
Just modify the where clause to zero in on a single page:


logparser
"SELECT TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time),3600)),
count(*) as Requests, avg(time-taken), cs-uri-stem from
<path to logs>\*.log WHERE
sc-status=200 AND (cs-uri-stem like '/<pagename>.aspx/%') GROUP BY
cs-uri-stem, TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time),3600))
ORDER BY TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time),3600)) DESC"
-i:IISW3C -o:datagrid






This query identifies the slowest ASPX pages and can be modified to capture other page types:

logparser
"SELECT TOP 20 cs-uri-stem, avg(time-taken) as AvgTime,
count(cs-uri-stem) as RequestCount FROM <path to logs>\*.log WHERE
(cs-uri-stem like '%.aspx') GROUP BY cs-uri-stem ORDER BY AvgTime DESC"
-i:IISW3C -o:datagrid
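
Where Log Parser is not available, the same two ideas (requests per hour and slowest pages) can be approximated in a few lines of Python. This is only a rough sketch and not part of the original post. It assumes the logs are in IIS W3C extended format with a "#Fields:" header naming date, time, cs-uri-stem and time-taken. The function names and the sample log lines are made up for illustration. Unlike the first query above, it buckets on date and time together, so hours from different days are kept apart.

```python
from collections import defaultdict
from datetime import datetime


def parse_w3c(lines):
    """Yield one dict per request from IIS W3C extended log lines."""
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # The header names the columns for the rows that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#"):
            continue  # other directives (#Software, #Date, ...)
        values = line.split()
        if len(values) != len(fields):
            continue  # skip malformed rows
        yield dict(zip(fields, values))


def load_per_hour(rows):
    """Count requests per one-hour bucket (all pages, all status codes)."""
    counts = defaultdict(int)
    for row in rows:
        stamp = datetime.strptime(row["date"] + " " + row["time"],
                                  "%Y-%m-%d %H:%M:%S")
        counts[stamp.replace(minute=0, second=0)] += 1
    return dict(sorted(counts.items()))


def slowest_pages(rows, suffix=".aspx", top=20):
    """Return (page, average time-taken, request count), slowest first."""
    totals = defaultdict(lambda: [0, 0])  # page -> [sum of time-taken, count]
    for row in rows:
        stem = row["cs-uri-stem"]
        if stem.lower().endswith(suffix):
            totals[stem][0] += int(row["time-taken"])
            totals[stem][1] += 1
    ranked = [(stem, total / count, count)
              for stem, (total, count) in totals.items()]
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:top]


# Made-up sample data, for illustration only.
SAMPLE_LOG = """#Software: Microsoft Internet Information Services 8.0
#Fields: date time cs-method cs-uri-stem sc-status time-taken
2014-09-18 10:05:00 GET /home.aspx 200 120
2014-09-18 10:45:00 GET /report.aspx 200 900
2014-09-18 11:10:00 GET /home.aspx 200 80
2014-09-18 11:15:00 GET /logo.png 200 5
""".splitlines()

if __name__ == "__main__":
    rows = list(parse_w3c(SAMPLE_LOG))
    for hour, count in load_per_hour(rows).items():
        print(hour, count)
    for page, avg_ms, count in slowest_pages(rows):
        print(page, avg_ms, count)
```

To run it against real logs, pass the lines of an open log file to parse_w3c instead of SAMPLE_LOG.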


New Microsoft Message Analyzer Released - IEInternals - Site Home - MSDN Blogs




If you want to monitor extremely low-level network traffic (e.g., TCP/IP packet flags or HTTPS alert records), then Fiddler
typically cannot help you; you will need to use a packet capture tool
like Wireshark or Microsoft’s Network Monitor (old) or Message Analyzer
(new).



Wednesday, September 10, 2014

Inside Apple’s Live Event Stream Failure, And Why It Happened: It Wasn’t A Capacity Issue




While at first I assumed it must be a capacity issue pertaining to Akamai, a deeper look at the code on Apple’s page and some other elements from the event shows that decisions made by Apple pertaining to their website, and problems with how they set up storage on Amazon’s S3 service, caused the biggest problems at the event.



Apple decided to add some JSON (JavaScript Object Notation) code to the apple.com page, which added an interactive element at the bottom showing tweets about the event. As a result, the page made refresh calls every few milliseconds, and the JSON code made the apple.com website uncacheable.

Thursday, September 04, 2014

Edge Show 116 - Docker on Azure | Edge | Channel 9




With this demo-heavy episode of The Edge
Show, you will learn how to start running Docker on top of Microsoft
Azure IaaS and get to know essential fundamentals of working with
Docker. 

Modern apps matter to IT 3 - Top 3 Reasons




 Modern apps are not just apps that display
as full screen. They offer other differences that can help solve some of
the biggest problems IT pros have had with older .exe-based
applications. Find out how, as Simon May explains the advantages offered
by modern apps.




The app container approach also has remarkable security implications
that resolve many of the standard headaches that IT gets involved with.




You can learn more using this TechNet virtual lab to sideload applications in Windows 8.1 or by downloading the Windows 8.1 evaluation from the eval center.

Visualizing Garbage Collection Algorithms


Mark-sweep eliminates some of the problems of reference counting. It can
easily handle cyclic structures and it has lower overhead since it
doesn’t need to maintain counts. 
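
The mark-sweep idea can be sketched in a few lines of Python. This is a toy model and not code from the linked article: the Obj class, the list-based heap and the function names are all assumptions made for illustration. It shows why a cycle that nothing else points to is still collected: the mark phase only follows references from the roots, so the cycle is never marked.

```python
class Obj:
    """A toy heap object with outgoing references and a mark bit."""

    def __init__(self, name):
        self.name = name
        self.refs = []       # objects this object points to
        self.marked = False


def mark(roots):
    """Mark phase: flag every object reachable from the roots."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj.marked:
            continue         # already visited, which also stops cycles
        obj.marked = True
        stack.extend(obj.refs)


def sweep(heap):
    """Sweep phase: keep marked objects and clear marks for the next run."""
    live = [obj for obj in heap if obj.marked]
    for obj in live:
        obj.marked = False
    return live


def collect(heap, roots):
    """Run one full mark-sweep cycle and return the surviving objects."""
    mark(roots)
    return sweep(heap)


if __name__ == "__main__":
    a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
    a.refs.append(b)         # a -> b is reachable from the root
    c.refs.append(d)         # c <-> d is an unreachable cycle
    d.refs.append(c)
    survivors = collect([a, b, c, d], roots=[a])
    print([obj.name for obj in survivors])
```

Note that surviving objects stay where they are in the list order; nothing is moved.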



One thing you may have noticed in the previous animations is that
objects never move. Once an object is allocated in memory, it stays in
the same place even if memory turns into a fragmented sea of islands
surrounded by black. The next two algorithms change that, but with
completely different approaches.
Mark-compact disposes of memory, not by just marking it free, but by moving objects down into the free space.
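
A minimal sketch of the compaction step, again as a toy model with made-up names rather than code from the linked article. The heap is assumed to be a fixed-size list of slots in which None means free. Because Python references follow the object wherever it sits in the list, this sketch needs no pointer fix-up; a real collector working with raw addresses would also have to update every reference to a moved object.

```python
class Obj:
    """A toy heap object with outgoing references and a mark bit."""

    def __init__(self, name):
        self.name = name
        self.refs = []
        self.marked = False


def mark(roots):
    """Flag every object reachable from the roots."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj.marked:
            continue
        obj.marked = True
        stack.extend(obj.refs)


def compact(heap):
    """Slide marked objects toward slot 0 and return the live count.

    Slots at or above the returned index are free afterwards.
    """
    free = 0                       # next slot to fill at the low end
    for slot in range(len(heap)):
        obj = heap[slot]
        heap[slot] = None          # vacate the old position
        if obj is not None and obj.marked:
            obj.marked = False     # reset for the next collection
            heap[free] = obj       # move the object down
            free += 1
    return free


if __name__ == "__main__":
    a, b, c, x, y = Obj("a"), Obj("b"), Obj("c"), Obj("x"), Obj("y")
    a.refs.append(b)
    b.refs.append(c)
    x.refs.append(y)               # x <-> y is unreachable garbage
    y.refs.append(x)
    heap = [a, x, b, None, y, c]   # live objects scattered between garbage
    mark([a])
    live = compact(heap)
    print(live, [obj.name if obj else None for obj in heap])
```

After the call, the live objects sit in one contiguous run at the start of the heap and the free space is a single block at the end.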

 

 

Friday, August 29, 2014

How Google can really help news & media | Om Malik




Google is good at one thing — software — and instead of trying to do
crazy things, why not build tools that help the news ecosystem? Why not
create tools that help data novices make sense of information? Or how
about a smarter, simpler and more nimble analytics tool just for
reporters? (Or simply buy Chartbeat!) 



A Google-powered search tool that lets reporters see, in real time, past stories from across the web.



Data-driven feature pieces (they used to call them infographics) were
commonplace in technology and business magazines like Wired and Red
Herring. 

OpenStack Trove Day 2014 Recap: MySQL and DBaaS | Javalobby




Tesora introduced their new Database Certification Program
at Trove Day. This new program will ensure a high level of
compatibility between the various participating database vendors and the
Trove project.

Thursday, August 28, 2014

Azure Search Scenarios and Capabilities | Microsoft Azure Blog




Users are used to Web search engines, sophisticated ecommerce websites
and social apps that offer great relevance, search suggestions as you
type, faceted navigation, highlighting and more, all with
near-instantaneous response times.



Solid search experiences bring challenges both on the information
retrieval front, where you need to deal with text analysis, ranking, etc.,
and on the distributed systems front, where you have to manage
scalability, reliability, etc.

Wednesday, August 27, 2014

Building Cloud Apps with the Azure WebJobs SDK | Microsoft Azure Blog




MS Press has published an e-book based on Scott Guthrie’s presentation, Building Cloud Apps With Windows Azure.
The book consistently features Azure Websites as the default choice for hosting web applications, with one exception where it switches to Cloud Services. In the queue-centric work pattern chapter, the book uses a Worker Role to handle backend processing for the Fix It sample application.

Monday, August 25, 2014

Anywhere, anytime, any device

Wordament part 1



Details on how the Wordament architecture relies on Azure services and
features you may find familiar (cloud services, blob storage) and
unfamiliar (instance input endpoints). Jason and John also provide
details about developer technologies such as Xamarin, which allows C# to
be the only programming language required for Wordament’s
cross-platform support.



New Versions: A feature that makes the Cloud Services model so attractive is the Virtual
IP (VIP) swap. The VIP swap allows them to stage a new version of their
service every single day as they rebuild and redeploy the site.



Clients and Apps: Each client has a thin hardware abstraction layer (HAL). It contains the
code necessary to display and process the Wordament UI controls, along
with the code that initiates contact and interacts with the Azure cloud
services.



Gets: To preserve the integrity of the data, stored procedures are the only
way stats are determined. As the Wordament guys say, they generate “a
small number of tables, but a large number of stored procedures.”



Synchronization: If the stacks differ from one role to the next, then the system can
flag the web role in question and redo the ranking. "It’s faster and
easier to start a new web role rather than trying to ‘fix’ it,"
according to Jason and John.



Projects and Solutions: The Wordament team uses partial classes to minimize #if directives, and
uses Xamarin for cross-platform C# compilers and .NET runtimes to keep
the code base manageable.



In the Wordament solution layout, there's a separate project per client.

Why developers should get excited about Java 9 | Java programming - InfoWorld




What to expect with JDK 9, which has been targeted for release in early 2016.

Thursday, August 21, 2014

What is OpenStack Trove? Trove Day 2014

Tesora, Tesora | SlideShare







Database-as-a-Service with OpenStack Trove | The Path To Open Hybrid Cloud

Code to make music

wavepot

Munich, Germany realizes that deploying Linux was a disaster, going back to Windows - Neowin




By 2011, Munich was supposedly running LiMux on more than 9000 machines.



Issues arose when the Linux OS users tried to work with those outside
the city and were unable to share files easily with those using other
applications. Moreover, the general idea is that Linux setups are
cheaper than a Microsoft solution because you do not have to pay licensing
fees, but Munich found that Linux was much more expensive.
Why? Because the city had to hire
programmers to build out functionality that it needed and then had to
pay the staff to maintain the software.